As businesses learn to make up-to-the-minute, data-driven decisions, the scope of their data teams' work keeps expanding. These teams must gain access to a plethora of data sources; understand the business value that resides in them; manage and move large data sets; and still find time to build meaningful dashboards and algorithms that drive business outcomes. Adding more data experts may spread the load, but it does not eliminate the myriad distractions from their core role: data analytics.
Furthermore, data teams often work with privileged user data — think patient records in healthcare, credit card transactions in finance, or employee data in HR. The business-critical work of the data team must be kept secure at all times and remain within the organization's control. This can place immense scrutiny on data teams from their security counterparts whenever they wish to use a third-party analytics tool. Data teams may then have to choose between running and maintaining custom in-house tooling and working with on-prem software. Aside from the cost of building or buying them, both options are often difficult to support, making them a major investment decision and a risk to overall agility. Such teams quickly become bogged down in systems administration, which distracts them from delivering value to their customers. And in the case of a production data outage, it's all hands on deck.
That’s where we come in.
Datacoral’s Platform and Architecture
Datacoral is a data engineering platform that provides connectors for 80+ sources, helping customers replicate data from each of them into the data warehouse of their choice. By using our platform, our customers can spend more time understanding their data instead of being distracted by data plumbing.
In a previous blog post, we explored how Datacoral uses a serverless, single-tenant architecture (one installation per customer) rather than a multi-tenant one. Because Datacoral's connectors are a set of serverless microservices deployed in each customer's Virtual Private Cloud (VPC), they scale up and down with that customer's data volumes independently, and no data (encrypted at rest using customer-owned keys) ever leaves the customer's environment. This has allowed us to deliver a scalable and secure data pipeline solution.
Datacoral provides additional security by ensuring that all resources running in our customers' environments assume the minimum privileges required to run successfully (the principle of least privilege). In practice, our product guides customers to create multiple narrowly scoped IAM roles for Datacoral resources, rather than a single role with blanket (or admin) privileges.
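To make the least-privilege idea concrete, here is a minimal sketch of what a narrowly scoped IAM policy looks like compared to blanket privileges. The bucket name, statement ID, and action list are hypothetical illustrations, not Datacoral's actual policies:

```python
import json

def scoped_connector_policy(bucket_name: str) -> dict:
    """Build an IAM policy limited to one staging bucket and the few S3
    actions a connector needs — rather than "Action": "*" on "Resource": "*".
    All names here are placeholders for illustration."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ConnectorStagingAccess",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",        # the bucket itself (ListBucket)
                    f"arn:aws:s3:::{bucket_name}/*",      # objects within it
                ],
            }
        ],
    }

policy = scoped_connector_policy("example-datacoral-staging")
print(json.dumps(policy, indent=2))
```

A role built from a policy like this can read and write its own staging data but cannot touch any other bucket or AWS service, which is exactly the property security teams look for when reviewing a third-party tool.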
Our VPC peering architecture ensures that our customers' production data never touches the public internet.
Separately, you can read our guiding philosophy on how to build a modern data stack (hint: be metadata first!). This philosophy has led to our platform today, which has replicated hundreds of terabytes of data for our customers.
Best of Both Worlds: Fully Managed, Secure Pipelines That Just Work
We recognize that data teams need to unlock data insights, not become buried in software, database, and pipeline administration. With much on-prem or in-house tooling, deployment and support quickly become a full-time job of monitoring, alerting, upgrades, and integrations. Beyond the scalability and security of our platform, we want to share what it means for Datacoral to be a managed service and how we alleviate our customers' pain points.
First, software upgrades: We want our customers to experience the fast pace at which we build new features, which means our software is upgraded regularly (every couple of weeks). Most on-prem software is upgraded fairly infrequently — painfully, and with downtime, when it is. Datacoral, by contrast, automatically pushes software upgrades through cross-account roles with zero downtime (thanks to the serverless nature of our services, which involve no large provisioned clusters). Upgrade time brings no headaches for our customers.
Next, monitoring and alerting: The Datacoral installation automatically monitors several classes of metrics:
- Source systems (when applicable): PostgreSQL replication slots when using the PostgreSQL Change Data Capture (CDC) connector
- AWS services: AWS Lambda functions, containers, Amazon Redshift, S3 buckets, and DynamoDB tables
- Data flow metrics: connector status, data freshness, and data quality metrics
These monitoring systems alert both us and our customers to any unusual behavior in the day-to-day workings of the data pipelines. Customers can also set up customized Slack alerts for the critical pipelines that power specific tables. This further reduces the work that data and operations teams must do to maintain their pipelines.
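The alerting described above boils down to comparing metric values against thresholds and notifying someone when a threshold is breached. Here is a simplified, hypothetical sketch of that pattern — the metric names and thresholds are illustrative, not Datacoral's internal implementation:

```python
from dataclasses import dataclass

@dataclass
class Metric:
    """A single pipeline health metric with its alert threshold.
    Names and units are assumptions for illustration."""
    name: str
    value: float
    threshold: float

def check_metrics(metrics):
    """Return an alert message for every metric that exceeds its threshold."""
    return [
        f"ALERT: {m.name} = {m.value} exceeds threshold {m.threshold}"
        for m in metrics
        if m.value > m.threshold
    ]

alerts = check_metrics([
    Metric("data_freshness_minutes", 95.0, 60.0),  # data is stale -> fires
    Metric("replication_slot_bytes", 1e6, 5e9),    # well under limit -> quiet
])
```

In a managed service, the evaluated alerts would then be routed to a channel such as Slack; the point is that the customer's data team never has to build or babysit this loop themselves.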
Finally, Datacoral allows customers to integrate easily with our pipelines by subscribing to our data events. This allows data teams to trigger their downstream workloads directly through our events, which decreases the time they spend building and maintaining internal tooling.
Case Study: Fully-managed Replication Slots in PostgreSQL CDC Connector
Datacoral shows its superpowers in managing replication slots with our Change Data Capture (CDC) connector for PostgreSQL. Replication slots are critical components of a PostgreSQL database: a CDC connector reads changes off a replication slot before writing them into the data warehouse.
The main question is: who owns and manages the replication slot? This matters because if the connector stops for any reason, the replication slot can grow until it brings down the database — with massive consequences for our customers' businesses. Most CDC vendors tell their customers to "manage" (monitor, alert on, create, delete) replication slots on their own, forcing teams to set aside operational resources specifically for this task.
At Datacoral, we remove this burden from our customers and own responsibility for the replication slots. As soon as our monitoring detects an anomaly, a notification goes out to our team and to the customer alerting them to the situation (see the image below for an example alert). The Datacoral connector then recovers in one of two ways: either by reconnecting to the database after a delay and continuing to read from the replication slot, or by recreating the slot and running an automated partial or full historical sync to fetch any missing data.
Example Slack alert for our operations team when PG replication slot size exceeds a threshold
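The slot-size check and the two recovery paths above can be sketched as follows. The SQL uses PostgreSQL's real `pg_replication_slots` view and `pg_wal_lsn_diff()` function (PostgreSQL 10+), but the byte thresholds and the decision function are hypothetical simplifications of the behavior described above; on a live database the query would run through a driver such as psycopg2:

```python
# Query to measure how far behind each replication slot is, in bytes of WAL.
SLOT_LAG_SQL = """
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes
FROM pg_replication_slots;
"""

WARN_BYTES = 5 * 1024**3       # illustrative: alert and try a gentle reconnect
RECREATE_BYTES = 50 * 1024**3  # illustrative: recreate slot + historical sync

def recovery_action(lag_bytes: int) -> str:
    """Map a slot's WAL lag to one of the recovery paths described above."""
    if lag_bytes >= RECREATE_BYTES:
        return "recreate_slot_and_resync"
    if lag_bytes >= WARN_BYTES:
        return "alert_and_reconnect"
    return "healthy"

action = recovery_action(6 * 1024**3)  # 6 GiB of lag -> alert_and_reconnect
```

The essential point is that this check runs continuously inside the managed service, so an unconsumed slot can never silently fill the database's disk.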
Your on-call Data Engineering team, powered by Datacoral
Datacoral provides a managed data engineering platform with complete data security in our customers' clouds. We know how critical data pipelines are to the analytics work our customers do, which is why we are obsessed with providing a worry-free experience. While our product handles most anomalous situations automatically, in the rare case when pipelines do break, our support team is always available to help via phone or chat. You can read about how we responded to get our customers' data pipelines up and running after a severe AWS outage last Thanksgiving.
Datacoral offers much more than observable, fully managed, and secure data pipelines: we give data teams the freedom to do their best work, without having to worry about schema change tracking, data synchronization, or data connectors. Our serverless architecture lets our customers scale through tremendous growth without data experts spending half their week on DevOps work. When services go offline or third-party integrations need to change, we are there to keep our customers online, healthy, and ready to grow.