Inside Authorized Workspaces, A New Feature in Astro
Astro customers can isolate teams or projects to specific clusters in their data planes, ensuring that data goes where it’s supposed to go.
Introducing Support for the Kubernetes Executor on Astro, Now in Private Preview
The K8s Executor offers Astro customers task isolation, efficient resourcing, and simplicity.
Autodesk Uses Astronomer's Airflow-Powered Orchestration to Support Its Cloud Transformation
When Autodesk outgrew its Oozie-based workflows, it partnered with Astronomer’s Professional Services team to migrate hundreds of critical DAGs.
CRED Boosts Data Pipeline Speed and Reliability with Fully Managed Airflow from Astro
CRED migrated to Astro to align business growth with a scalable Airflow infrastructure.
Get Improved Data Quality Checks in Airflow with the Updated Great Expectations Operator
The refactored GreatExpectationsOperator boosts ease of use and flexibility.
Introducing Astro’s New Workspace Homepage
The updated interface features an overview of recent task and DAG runs and resource guides for new and advanced users.
Today, we shared some important business updates with the company.
The Airflow Year in Review 2022
Data-dependent scheduling, dynamic task mapping, and UI improvements were standout updates in Airflow's busy year.
Win a Scholarship for CoRise’s “Effective Data Orchestration with Airflow” Course
Astronomer is giving away five full scholarships to CoRise’s new Airflow and data orchestration course.
The New, Faster Way to Deploy Airflow DAGs to Astro
Pushing Airflow DAGs to Astro separately from the rest of your environment files means faster deploys, no downtime for your Deployment, and a CI/CD process that best fits your team.
How an Improved DAG-Testing Command in the Astro CLI Made Its Way into Airflow
Updates to the Astro and Airflow CLIs make DAG authoring easier and help DAGs run more reliably.
5 Ways to View and Manage DAGs in Airflow
See how recent UI updates make Airflow more connected, usable, and observable.
Introducing the Astro Cloud IDE
The Astro Cloud IDE is a notebook-inspired tool for writing data pipelines — no Airflow knowledge required.
What’s New in Apache Airflow 2.5
Airflow’s third significant release of 2022 introduces a redesigned DAG test feature, task annotations, data-dependent scheduling enhancements, and more.
A Short History of DAG Writing
The needs of the data team have evolved since Apache Airflow was open-sourced by Airbnb in 2015, pushing Airflow itself to evolve. We’re excited about how far we’ve come, and have some big ideas about where Airflow and orchestration can go.
Best Practices for Secure Network Connectivity and Authentication in Astro
Get the most out of Astro by incorporating solid connectivity and authentication practices from the start.
3 Ways to Extract Data Lineage with Airflow
Learn how to use Airflow’s operators, custom extractors, and inlet/outlet arguments to send lineage to your data observability tool.
How Astro’s Data Graph Helps Data Engineers Run and Fix Their Pipelines
Learn how Astro can help you understand, communicate, and solve pipeline problems.
How FanDuel Delivers Its Most Complex Data Reports Reliably and Efficiently with Astro
FanDuel moved from open source Apache Airflow to Astro, and engaged with Astronomer's Professional Services team to keep reliably and efficiently delivering data to the whole business as it grows.
OpenLineage: Where It Came from and What Comes Next
Data lineage is having a moment. Julien Le Dem, Co-Founder of OpenLineage and Chief Architect at Astronomer, looks at the drivers behind a notable increase in lineage adoption.
How to Keep Data Quality in Check with Airflow
The Airflow-driven data quality checks that we use at Astronomer ensure bad data is found and fixed quickly.
What’s New in Astro Python SDK 1.1: Data-Driven Scheduling, Dynamic Tasks, and Redshift Support
Learn what the newly upgraded Astro Python SDK 1.1 offers Airflow users.
Orchestration, or How to Become a Data-Driven Company
Learn how Astronomer’s ambitious vision for data orchestration has led to a fully coordinated ecosystem that runs consistently and reliably, with minimal overhead and with full control, visibility, and governance of data operations.
Micropipelines: A Microservice Approach for DAG Authoring in Apache Airflow
Airflow 2.4 lets you break down big, monolithic pipelines — in which long-running tasks can delay time-sensitive ones — into “micropipelines” that let you tune your data ecosystem and make critical data products available on time.
VTEX Achieves Consistency Across Its Data Environments with Astro
Learn how a huge digital commerce platform cut through the complexity and got things running smoothly with the fully managed, Airflow-powered orchestration service.
Expanding Data Access and Exchange Inside a Company
How a technical ecosystem built by Astronomer’s data science team is empowering internal groups to own and share data.
Airflow 2.4 and Data-Driven Scheduling: How a New Feature Is Saving Time at Astronomer
The new data-driven scheduling functionality in Airflow 2.4 eliminates a lot of complexity and toil. See how Astronomer is applying this useful feature.
Astro CLI: The Easiest Way to Install Apache Airflow
Learn all about the Astro CLI — the free, open source tool that makes it easy to install, run, and test Apache Airflow from your command line in under five minutes.
Apache Airflow 2.4 — Everything You Need to Know
Airflow 2.4 is here, with a new datasets feature that augments Airflow with powerful data-driven scheduling capabilities. And much more!
Podcast Spotlight: What Observability Brings to Data Orchestration
Astronomer’s SVP of R&D talks about the “virtuous circle” that combining them creates for business.
Announcing Astro’s HIPAA and PCI-DSS Compliance
Two more reasons that users who need high levels of data security and protection can count on Astro.
How We Track the Growth of Apache Airflow
Find out how we use data to keep track of what’s happening in the Airflow project.
Reimagining Airflow for Data Engineers and Data Scientists with the Astro Python SDK
Introducing the fastest way to get started writing Airflow data pipelines.
Astro Is Now Available on All Major Cloud Providers
Data teams can now run Astro — the data orchestration platform powered by Airflow — on AWS, Azure, and Google Cloud, providing access to the latest Airflow updates, smooth integration of your native services, and more time to focus on your pipelines.
Everything You Should Know About Airflow 2.3’s New Grid View
Airflow 2.3’s new grid view is a compact, intuitive way to visualize complex representations in Airflow’s UI. Today, we present a detailed overview of the feature.
The Astronomer Providers Package — A Better Option for Long-Running Tasks
A deeper dive into this new collection of deferrable operators, hooks, and sensors, which can save money by reducing the resources Airflow consumes.
Red Ventures Brings Reliability and Resilience to Its AI Stack with Astro
Learn how Red Ventures increased its agility — and its developers’ productivity — by using Astronomer’s fully managed Airflow service to orchestrate its workflows.
Ventana Research Names Astronomer as a Finalist for Its Digital Innovation Awards
Astronomer is a finalist for the 15th Annual Ventana Research Digital Innovation Awards, which recognize companies for creating innovative technologies.
Introducing Astro, the Fully Managed Data Orchestration Platform, Powered by Airflow
Astro lets organizations build, run, and observe data pipelines quickly and reliably. Its support for data lineage provides complete visibility into pipelines running across projects, regions, and clouds.
Introducing New Astro CLI Commands to Make DAG Testing Easier
The open-source Astro CLI is the easiest way to develop DAGs with Apache Airflow locally. The new Astro CLI commands we’re announcing today give users a simple, standardized, frustration-free way to test and debug their DAGs and tasks.
To Build or to Buy? DIY Orchestration with Airflow vs. A Fully Managed Service
In this article, we look at a few common reasons organizations consider building their own Airflow infrastructures, and at how a fully managed service can save resources and make you more competitive.
Introducing Astronomer Providers
We’re excited to introduce Astronomer Providers, a set of Apache 2.0-licensed Airflow providers with async functionality, created and maintained by Astronomer.
Apache Airflow 2.3 — Everything You Need to Know
Dynamic task mapping, a new local executor, an improved grid view… Find out what’s new in Apache Airflow 2.3.
Apache Airflow for Data Scientists
What are the common challenges data scientists face, and how can Apache Airflow help? Today, we explore the role of a data scientist.
10 Best Practices for Modern Data Orchestration with Airflow
Today, we’re identifying best practices that will allow you to stand up, scale, and grow Airflow in support of both operational data integration and modern data orchestration.
Airflow and dbt, Hand in Hand
Whether it’s Cloud or Core, using Airflow and dbt together makes life better for everyone. Learn about all the ways they can be combined, as well as a new dbt Cloud provider.
What Is Data Lineage and Why Does It Matter?
To operate in today’s distributed data ecosystems, you need a complete and up-to-date picture of your environment at all times. Learn how data lineage can help you make sense of your data.
Letter from the CEO: Our Story So Far
On the heels of our Series C and our acquisition of Datakin, Joe Otto reflects on Astronomer’s history, and looks to a future powered by the combination of orchestration, lineage, and observability.
Astronomer Acquires Datakin, the Data Lineage Tool
Joining forces will accelerate our shared goal: To help organizations build and manage reliable data ecosystems that deliver trusted data and drive business-critical decisions.
TechCrunch on Astronomer’s Big News
The site covered our recent acquisition of Datakin and our Series C round.
Apache Airflow for Data Leaders — How to Empower Data Teams
What are the common mistakes data leaders make? What goals should they prioritize? How does Astronomer help them overcome these challenges? We asked expert Steven Hillion, VP of Data at Astronomer.
Airflow Summit 2022 — Join the Airflow Event of the Year!
The biggest community-driven event around Apache Airflow returns May 23–27, 2022.
Apache Airflow at Astronomer—Taking Data Orchestration to the Next Level
Learn how Astronomer drives the Apache Airflow project together with the community.
10 Best Practices for Airflow Users
Discover ten best practices that will help all Apache Airflow users ensure their data pipelines run smoothly and efficiently.
Top Data Management Trends for 2022
We talked to our Airflow experts about what data management trends to look out for in 2022.
Adding Data Quality to DAGs ft. Great Expectations
Adding data quality to DAGs is an iterative process, and Great Expectations is a preferred tool to use for that process.
Astronomer and Uturn Partner to Drive Innovation and Better Business Outcomes
We're excited to announce our partnership with Uturn!
Apache Airflow for Data Engineers—How to Leverage Data Orchestration
How has the role of a data engineer evolved over the years? What are their main responsibilities and how can Airflow help?
How to Select the Best ETL Tool to Integrate With Airflow? Our 3 Picks
Find out if choosing the best ETL tool is easy and which three ETL tools we like to combine with Airflow.
Every Company Nowadays Becomes a Data Company—Interview with Bolke de Bruin
An interview with Astronomer's VP of Enterprise Data Services on everything data and Airflow.
Machine Learning Pipeline Orchestration
A complete guide to orchestrating machine learning in production.
How to Build a Modern Data Stack
Breaking down what a modern data stack means in practice. We discuss four core components, five reasons to set it up, and how to orchestrate it.
Apache Airflow vs. Apache Beam
Apache Airflow or Apache Beam? Or both, working together? Let's take a closer look at two popular open source data management tools.
Democratizing the Data Stack—Airflow for Business Workflows
Learn how Hightouch drives action in marketing & sales teams with Reverse ETL, SQL, and Apache Airflow.
Machine Learning Pipelines: Everything You Need to Know
Learn what the process of building an ML pipeline involves, what the steps are, and how to do it with Airflow and Astronomer.
What is Reverse ETL and How Can It Improve Data Flow?
Find out what reverse ETL is and how to use Census and Airflow together to improve data orchestration.
Airflow at BBC—Data Orchestration Solution in Media
A conversation with the BBC's Principal Data Engineer about how Apache Airflow helps them deliver personalized experiences to the audience.
Everything You Need to Know About Apache Airflow 2.2.0
It's alive! Discover the major Airflow 2.2.0 features including customizable timetables, deferrable tasks, Airflow standalone CLI command, and many more.
Big Data Architecture: Core Components, Use Cases, and Limitations
Is Big Data Architecture the answer to major business problems, or just a crucial piece of a bigger puzzle? Discover our insights on the topic in this short blog post!
The Future of Banking: How Can Apache Airflow Help?
Learn what challenges the banking industry faces today, and how Apache Airflow can help with digital transformation.
Apache NiFi vs. Apache Airflow
Overview and comparison study of two popular ETL tools for managing the golden asset of most organizations: data. Can these two be compared at all?
Airflow at Wise: Data Orchestrator in Machine Learning
A talk with Alexandra Abbas—a Machine Learning Engineer at Wise—about how they leverage Apache Airflow in their ML initiatives.
How to Build an ETL Process?
Extract, transform, load. Discover the vital steps and methods of building an ETL process for your business.
Data Silos: What Are They and How to Fix Them?
Everything you need to know about data silos – how they influence your business, where they come from, and how to fix them.
Airflow at Societe Generale: Data Orchestration Solution in Banking
A conversation with Societe Generale about their Airflow implementation and development of the data orchestration solution.
Data Pipeline: Components, Types, and Best Practices
Why are data pipelines an absolute necessity for any smart, data-driven business? Learn the basics and follow our best practices.
Building a Scalable Analytics Architecture With Airflow and dbt: Part 3
Learn how to build a scalable analytics architecture with Apache Airflow and dbt – in the third and final part of our series.
What Is Data Orchestration and Why Is It Essential for Business
Discover what data orchestration is, learn the most significant pain points it addresses, and find out how to help your business grow.
Airflow Summit 2021 Highlights
Learn about 2021's biggest community-driven event around Apache Airflow and the power of the Airflow community.
How Data Pipelines Drive Improved Sales in E-commerce
Our Field CTO, Viraj Parekh, shares insights on how sales and marketing operations in e-commerce can benefit from running functional data pipelines.
Airflow and Ray: a Data Science Story
We're pleased to announce a Ray provider for Apache Airflow that allows users to transform their Airflow DAGs into scalable machine learning pipelines.
Everything You Need to Know About the Airflow Summit 2021
Join Airflow Summit 2021 – a free online conference for the worldwide community of developers and users of Apache Airflow.
Validate Your Apache Airflow Skills With the Astronomer Certification
Boost your career and learn to run a data pipeline by getting Apache Airflow certified with the Astronomer Certification for Apache Airflow Fundamentals.
The New KubernetesExecutor
We give you a tour of the new features in the KubernetesExecutor 2.0. Spoiler alert – it's faster, more flexible, and easier to understand.
Announcing the Astronomer Registry
Today, we're excited to release our discovery and distribution hub for Apache Airflow integrations.
Airflow 2.0 TaskFlow API and Its Features
Learn how the TaskFlow API in Airflow 2.0 enables a better DAG authoring experience.
Secrets Management in Airflow 2.0
Secrets are pieces of sensitive information used as part of your DAG. Here are some best practices for managing them in Apache Airflow 2.0.
Change Data Capture With Apache Airflow: Part 1
Implementing production-grade change data capture in near real-time on Google CloudSQL with Apache Airflow.
Building a Scalable Analytics Architecture With Airflow and dbt: Part 2
Now that we have these DAGs running locally and built from our dbt `manifest.json` file, the natural next step is to evaluate how these should look in a production context.
Building a Scalable Analytics Architecture With Airflow and dbt
Implementing an ideal development experience at the intersection of two popular open-source tools, written in collaboration with our friends at Updater.
The Airflow 2.0 Scheduler
A technical deep-dive into Apache Airflow's refactored Scheduler, now significantly faster and ready for scale.
A Great Expectations Provider for Apache Airflow
We're pleased to announce an official integration that allows users to leverage Great Expectations natively in their DAGs.
Introducing Airflow 2.0
A breakdown of the major features incorporated in Apache Airflow 2.0, including a refactored, highly available Scheduler, over 30 UI/UX improvements, a new REST API, and much more.
Introducing KEDA for Airflow
Using KEDA (Kubernetes Event-Driven Autoscaler), we've developed a robust method to scale Apache Airflow workers to be faster and more versatile than any previous architecture.
Profiling the Airflow Scheduler
Ash explains how he's been benchmarking and profiling the Airflow scheduler using py-spy and Flame Graphs.
Airflow continues to win due to an active and expanding community, and very deep, proven functionality.
The Next Generation of Astronomer Cloud
A new release of Astronomer Cloud built to support our latest features and designed to be a first step towards multi-cloud and multi-region support.
Announcing v0.10 of the Astronomer platform.
7 Common Errors to Check When Debugging Airflow DAGs
Tasks not running? DAG stuck? Logs nowhere to be found? We’ve been there. Here’s a list of common snags and some corresponding fixes to consider when you’re debugging your Airflow deployment.
Airflow Design Principles: Multi-tenant vs. Monolithic Architecture
Why we decided that a multi-tenant Airflow architecture would be the most efficient and reliable way to run our DAGs.
Astronomer v0.8.0 Release Notes
Release notes for v0.8 of the Astronomer Platform.
Astronomer on Astronomer: Loading Thousands of Files Into Redshift With Apache Airflow
Here's the story of why we chose Airflow, how we use it, what we've learned, and what we're building to make it better.
Astronomer v0.7.0 Release Notes
Release notes covering the features released with v0.7.0 of the Astronomer platform.
Astronomer v0.6.0 Release
Release notes for v0.6.0 of the Astronomer platform.
Astronomer v0.5.0 Release
Release notes from our recent platform update to v0.5.0.
Astronomer v0.4.1 Release
Release notes on v0.4.1 of the Astronomer platform.
How the Apache Airflow Project Will Change
Discussing the potential future direction of the Apache Airflow project.
Astronomer v0.3.2 Release
A rundown of features and product improvements since our v0.3.0 release.
Announcing Astronomer v0.3
Announcing the latest iteration of our Airflow offering.
Astronomer Enterprise Edition 0.2.0
Announcing the latest iteration of our Airflow offering.
Announcing the Astronomer Platform, a Managed Service for Apache Airflow
Managed Apache Airflow for complex ETL orchestration.
Announcing Astronomer Enterprise Edition
Our latest platform deployable in your cloud.
Announcing Astronomer SpaceCamp
The fastest way to ramp up your data team.
Announcing The Airflow Podcast
A podcast focused on sharing the open-source community's knowledge about Apache Airflow.
An Airflow Story: Cleaning and Visualizing Our GitHub Data
How we used Airflow to clean up a GitHub mess.
Improving Government Services With Apache Airflow: a Q&A With San Diego’s Chief Data Officer
Applying Airflow in the public sector to operationalize public data.
From Behavioral Analytics to Data Science With Astronomer
Being a stand-out company requires elevating product innovation within the organization and making sure that innovation isn’t reactive, but predictive.
Using Apache Airflow to Create Data Infrastructure in the Public Sector
When ARGO began exploring the technology required to build, operate, and maintain data infrastructure in the public sector, it’s no surprise they landed on Apache Airflow.
Data Formats 101
Business analysts generally encounter four main formats of data: JSON, XML, CSV, and TSV. So what are these types and why would we use them?
Why Every Data Scientist Needs a Data Engineer
The data scientist, the sexiest role of the 21st century, isn't actually very sexy. But it could be.
What Exactly Is a DAG?
What exactly is a DAG and what does it tell us that the term “data pipeline” can't?
Data Engineering Platform Astronomer Closes $3.5M Financing
The Astronomer platform collects, processes and unifies data, allowing organizations to scale analytics, data science and insights.
Normalizing Data for Warehouse Centralization
Knowing the options for storing data will help you make the right decisions for your company when you’re ready to take this step.
Apache Airflow and the Future of Data Engineering: A Q&A with Maxime Beauchemin
I reached out to Max about doing an interview post, and to my delight, he agreed. Here are thoughtful answers to questions about Airflow and data engineering.
Our Open Source Philosophy
The world where technology is open sourced is the world we want to live in. Our CTO explains why.
Why Is My Data Playing Hard to Get?
When we talk about "hard to reach" data, what kind of data are we talking about, and why exactly is it so hard to access, organize, and store?
Airflow at Astronomer
To extract and monitor all types of data pipelines, we needed a unified scheduling system. Airflow was our answer— and a whole lot more.
Our Unique Path to Raising $2M Seed in the Midwest
Our path to raising $2M is a series of short stories with some amazing protagonists.
Press Release: Astronomer Announces Seed Financing
Press Release: Astronomer Closes $1.9M in Seed Financing
Lessons Learned Writing Data Pipelines
I know first-hand how challenging data pipelines can be. Here's a peek under the hood of Astronomer at what makes our growing platform unique.
Why We Built Our Data Platform on AWS, and Why We Rebuilt It With Open Source
As Astronomer’s CTO, I’m going to chronicle our journey, from a technical perspective, as we grow our platform and home in on how to meet our users’ real needs.
An Almost Acquisition Story
Coming out of AngelPad’s 2015 Demo Day, we found ourselves vacillating between an acquisition and Series A, though we were arguably too early for either.
Announcing Astronomer v0.9
Release notes for v0.9 of the Astronomer Platform.
A Logo Story
Astronomer's Head of Design, Chris Hendrixson, explains how he created the design aesthetic to encompass data, futurism, and a little bit of fun.
Setting Up Your Redshift Cluster
Redshift is popular, but you still need to know what you're doing when spinning up your first cluster. In this tutorial, we walk you through the process.
When Should You Start to Warehouse Your Data?
These days, startups want to be data-driven, and web and mobile apps can generate quite a bit of data.
Why We Drove to NY and Back Over the Past 48 Hours for a 15-Minute Meeting
The Astronomer team drove from Cincinnati, OH to New York, NY for a fifteen-minute meeting with the top accelerator in the world. Now... Why did we do that?