Data Platform/Evaluations/2021 data catalog selection/Rubric/Amundsen
Core Service and Dependency Setup
Ingestion Configuration
Progress Status
Perceptions
Outcome
Razzi's take on Amundsen
Pros:
- simple architecture: three Flask services, all written in Python (as opposed to DataHub, which uses both Java and Python)
- ingestion architecture is simple: Python scripts or Airflow DAGs that make HTTP API requests
- "social" UI features, such as frequent users and owners
- loose coupling means a relational database can be used as the data store instead of Neo4j (https://github.com/amundsen-io/amundsenrds)
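To make the ingestion point concrete, here is a minimal sketch of what "a Python script that makes an HTTP API request" could look like. The endpoint URL, port, payload fields, and the `build_table_request` helper are all hypothetical and are not taken from Amundsen's documentation; the sketch only builds the request and does not send it.

```python
import json
from urllib.request import Request

# Hypothetical endpoint for illustration only; check the Amundsen docs for real routes.
METADATA_SERVICE_URL = "http://localhost:5002/table"


def build_table_request(database, schema, name, description, url=METADATA_SERVICE_URL):
    """Build (but do not send) an HTTP request carrying one table's metadata."""
    # Hypothetical payload shape; the real service may expect different fields.
    payload = {
        "database": database,
        "schema": schema,
        "name": name,
        "description": description,
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_table_request(
    "hive", "example_schema", "example_table", "Example table description"
)
# Sending it would be: urllib.request.urlopen(request)
```

The same function could be called from a standalone script or wrapped in an Airflow task, which is what makes this style of ingestion easy to reason about.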
Cons:
- the community seems to be losing steam: https://github.com/amundsen-io/amundsen#blog-posts-and-interviews lists a flurry of events in 2019/2020 but nothing in 2021
- only supports polling for data updates unless we also deploy Atlas; a push ingest API is on their roadmap
- documentation is somewhat lacking: there are few ingestion examples, and the docs contain broken links
- some dependencies are getting out of date: Elasticsearch version 6 (v7 was released in 2019), Node.js version 12 (v13 was released in 2019)
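As a rough illustration of what polling-only ingestion implies, the sketch below compares two periodic snapshots of a source to find what changed between runs. The snapshot shape (table name mapped to a schema fingerprint) and the `diff_snapshots` helper are assumptions made for this example, not part of Amundsen.

```python
def diff_snapshots(previous, current):
    """Compare two {table_name: schema_fingerprint} snapshots from successive polls."""
    previous_names = set(previous)
    current_names = set(current)
    return {
        # Tables that appeared since the last poll.
        "added": sorted(current_names - previous_names),
        # Tables that disappeared since the last poll.
        "removed": sorted(previous_names - current_names),
        # Tables present in both polls whose fingerprint differs.
        "changed": sorted(
            name
            for name in previous_names & current_names
            if previous[name] != current[name]
        ),
    }


last_poll = {"orders": "v1", "users": "v1"}
this_poll = {"orders": "v2", "events": "v1"}
changes = diff_snapshots(last_poll, this_poll)
```

With polling, a change only becomes visible in the catalog after the next scheduled run, so freshness is bounded by the polling interval; a push API would let the source announce the change as soon as it happens.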
Amundsen was created by Lyft and is now hosted by the Linux Foundation.