Fundraising/techops/docs/analytics stack
Analytics Stack Components
| System | Description | Host:Port | Log Location |
|---|---|---|---|
| Trino | Trino is a query engine used to process analytics data in FR-Tech's data lake. The current implementation consists of one coordinator node and three worker nodes, each on its own host. | Coordinator: fransc2001:8443; Workers- | <host>/var/lib/trino/trino-data/var/log; query history in Trino or MariaDB under |
| Dagster | Dagster is an orchestrator used to schedule jobs that load or manipulate data. | fran2001:3000 | syslog, tagged with 'dagster'. Note that in vb the logs print to stdout rather than syslog, since we aren't using systemd there. |
| dbt | dbt (data build tool) is an open source framework for modeling data. It is essentially an orchestrator for SQL commands that ensures they run in an order that respects each data model's upstream and downstream dependencies. | fran2001 | syslog for the Dagster materialization logs; general dbt logs are in /srv/dagster_data/dbt_log/dbt.log; individual run logs are in: |
| Hive Metastore | Hive Metastore holds the metadata that Iceberg and Trino use to map Trino tables to file locations in minIO. | fransc2001:9083, or the same host as the Trino coordinator | syslog |
| Metabase | Metabase is a business intelligence tool used to visualize and analyze data. FR Analytics is in the process of migrating from Apache Superset to Metabase. | fran2001:9081 | syslog |
| Superset | Superset is a business intelligence tool. We are migrating from Superset to Metabase. | fran2001:9080 | syslog |
| minIO | minIO is an object storage tool that holds our data lake; the physical data files are stored in minIO. | franio200[1-3]:9000 | not currently logging |
| MariaDB | MariaDB is the database CiviCRM uses to hold all its data. | frdb2003 | syslog and /srv/sqldata/{hostname}-slow.log |
| dlt | dlt (data load tool) is an open source framework for loading data to and from various APIs and databases. | fran2001 | syslog, but not currently tagged. All dlt logs are probably tagged with 'dagster', since dlt scripts run through Dagster. |
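
The dbt entry in the table describes running SQL models in an order that respects their upstream and downstream dependencies. As a rough illustration of that idea only (this is not dbt's own code, and the model names below are hypothetical, not models from this stack), a dependency-respecting run order can be sketched with Python's standard `graphlib` module:

```python
from graphlib import TopologicalSorter

# Hypothetical models, each mapped to the set of upstream models it depends on.
# These names are illustrative only and do not come from the real dbt project.
dependencies = {
    "stg_donations": set(),
    "stg_contacts": set(),
    "donor_totals": {"stg_donations", "stg_contacts"},
    "monthly_report": {"donor_totals"},
}


def run_order(deps):
    """Return model names so that every model appears after its upstream models."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

Any order the sorter returns places `donor_totals` after both staging models and `monthly_report` last; the relative order of the two staging models is not guaranteed.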
syslog is in /var/log/syslog on any host.
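
Since several components log to syslog and are told apart by tag, a small filter can help when reading the file. The following is a minimal sketch, assuming lines follow the common `<timestamp> <host> <tag>[<pid>]: <message>` layout (either the classic `Jan  5 10:00:01` timestamp or a single-token ISO timestamp); the actual format on these hosts may differ, and the sample lines are invented for illustration:

```python
import re

# Assumed layout: "<timestamp> <host> <tag>[<pid>]: <message>".
# The timestamp is either classic ("Jan  5 10:00:01") or one token (ISO 8601).
SYSLOG_LINE = re.compile(
    r"^(?:\w{3}\s+\d{1,2}\s+[\d:]+|\S+)\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<tag>[^\s:\[]+)(?:\[\d+\])?:\s?"
    r"(?P<msg>.*)$"
)


def lines_with_tag(lines, tag):
    """Return the message part of every syslog line whose tag equals `tag`."""
    messages = []
    for line in lines:
        match = SYSLOG_LINE.match(line.rstrip("\n"))
        if match and match.group("tag") == tag:
            messages.append(match.group("msg"))
    return messages


# Invented sample lines; real log content will look different.
sample = [
    "Jan  5 10:00:01 fran2001 dagster[4242]: run started",
    "Jan  5 10:00:02 fran2001 metabase[17]: query finished",
    "2024-01-05T10:00:03+00:00 fran2001 dagster: sensor tick",
]

dagster_messages = lines_with_tag(sample, "dagster")
print(dagster_messages)
```

On a host, the same function could be given an open file handle, for example `lines_with_tag(open("/var/log/syslog"), "dagster")`, provided the user has read access to the file.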
Analytics How-to's
Fundraising/techops/docs/analytics stack/how to guides
Troubleshooting
Fundraising/techops/docs/analytics stack/troubleshooting