At Datacoral, we spend a lot of time thinking about security and building our product so that our customers have complete control over their data and credentials inside their Amazon Web Services (AWS) accounts. This has led to a unique architecture that allows secure data ingestion from all kinds of data sources into each customer's data warehouse of choice.
Datacoral’s software deployment model involves installing our software in our customer’s VPC. Datacoral doesn’t have access to any customer data, and our customers’ DevOps teams can have full control over all parts of their data infrastructure stack. While this already provides a high level of data security in today’s world of SaaS products, we offer the following lessons from our years of working with AWS to further improve accountability and auditability. We believe that, whether you use Datacoral or not, these can help put the right safeguards in place to prevent and triage data breaches and security incidents.
The first step in enabling governance, compliance, operational auditing, and risk auditing is having a complete and robust Audit Log. So, log all actions performed by systems and users within your infrastructure in a robust manner with full attribution – like the identity of the user/system, the location where the action was initiated, the service where the action was performed, and the details of the action performed.
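As a rough illustration of "full attribution", the sketch below models one audit entry carrying the fields listed above and serializes it as a single JSON line. The class name `AuditRecord`, its field names, and the helper functions are assumptions made for this example, not a Datacoral or AWS format.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """One audit log entry (hypothetical schema for illustration)."""
    actor: str        # identity of the user or system
    source_ip: str    # location where the action was initiated
    service: str      # service where the action was performed
    action: str       # what was done
    details: dict     # free-form details of the action
    timestamp: str    # ISO 8601 timestamp in UTC


def make_record(actor, source_ip, service, action, details, now=None):
    """Build a record, stamping it with the current UTC time by default."""
    now = now or datetime.now(timezone.utc)
    return AuditRecord(actor, source_ip, service, action, details, now.isoformat())


def to_log_line(record):
    """Serialize to one JSON line so entries can be appended to a log file."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    record = make_record(
        actor="alice@example.com",      # made-up values
        source_ip="203.0.113.7",
        service="redshift",
        action="DROP TABLE",
        details={"table": "staging.events"},
    )
    print(to_log_line(record))
```

Writing one self-describing JSON object per line keeps entries easy to append and easy to query later, whichever tool ends up reading them.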
Where to write the Audit Log?
Every auditing technique described below that produces logs should direct its logs to an S3 bucket that lives in a separate AWS account. This AWS account should be accessible only to admins and legal. In addition, set up the right policies on the bucket so that objects can only be written or read, never deleted.
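One way to express the "never deleted" rule is a bucket policy with an explicit deny on the delete actions. The sketch below only builds the policy document as JSON; the bucket name `example-audit-logs` and the function name are placeholders, and a real policy would also need statements that grant the logging services write access.

```python
import json


def deny_delete_policy(bucket_name):
    """Return a bucket policy document that denies object deletion for everyone.

    `bucket_name` is a placeholder; substitute your own audit bucket.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyObjectDeletion",
                "Effect": "Deny",
                "Principal": "*",
                # Covers deleting current objects and specific object versions.
                "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
                "Resource": f"{bucket_arn}/*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_delete_policy("example-audit-logs"), indent=2))
```

A deny on deletes does not by itself stop an object from being overwritten, so it may be worth pairing a policy like this with bucket versioning or S3 Object Lock.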
Monitor User Activity and API Usage in AWS
You can set up AWS CloudTrail to monitor every single action taken within your AWS account. This includes actions taken through the AWS Management Console, AWS SDKs, command line tools, and other AWS services. CloudTrail can simplify compliance audits, help troubleshoot security incidents, provide visibility into user activity, and allow for automatic responses to specific activity. Read our documentation for more information on how to set this up.
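To give a feel for what CloudTrail captures, here is a minimal sketch that reads the text of one decompressed CloudTrail log file (a JSON document with a top-level `Records` array) and pulls out who did what, from where. The function name and the sample event are made up for illustration; real events carry many more fields.

```python
import json


def summarize_events(log_file_text):
    """Extract basic attribution from each event in a CloudTrail log file."""
    records = json.loads(log_file_text).get("Records", [])
    summaries = []
    for record in records:
        identity = record.get("userIdentity", {})
        summaries.append({
            "time": record.get("eventTime"),
            # Fall back to the identity type when no ARN is present.
            "who": identity.get("arn") or identity.get("type"),
            "from": record.get("sourceIPAddress"),
            "service": record.get("eventSource"),
            "action": record.get("eventName"),
            "region": record.get("awsRegion"),
        })
    return summaries


if __name__ == "__main__":
    # Made-up sample event, trimmed to the fields used above.
    sample = json.dumps({"Records": [{
        "eventTime": "2020-01-01T12:00:00Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "DeleteBucket",
        "awsRegion": "us-west-2",
        "sourceIPAddress": "203.0.113.7",
        "userIdentity": {"type": "IAMUser",
                         "arn": "arn:aws:iam::123456789012:user/alice"},
    }]})
    for event in summarize_events(sample):
        print(event)
```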
Data Warehouse Activity Logging
Amazon Redshift
Amazon Redshift allows database auditing for security and troubleshooting purposes. The log files that are generated include:
- Connection logs: authentication attempts, connections and disconnections
- User logs: information about changes to database user definitions
- User activity logs: each query before it is run on the database
Read our documentation for information on how to set this up.
Amazon Athena
Athena is well integrated with CloudTrail: all Athena actions are logged by CloudTrail. See the Athena documentation for details of the structure of the Athena CloudTrail logs.
Snowflake
Snowflake has a more integrated audit logging mechanism where the logs are queryable within Snowflake itself. However, the logs are available only for the past 365 days (1 year). If longer retention is needed, Datacoral offers an S3 Publisher that can extract the audit logs from Snowflake and write them to the audit S3 bucket.
Amazon S3 Server Access Logging
Amazon S3 server access logging provides a detailed record of all requests made to a bucket. Beyond security and access audits, these logs can also help you understand customer usage and your AWS bill.
Read our documentation for steps on how to set this up.
VPC Flow Logs
VPC Flow Logs is a feature that captures information about the IP traffic going to and from network interfaces in your VPC. These logs can help you diagnose security group rules, monitor the traffic reaching resources within your VPCs, and determine the traffic to and from specific network interfaces.
Read our documentation for further information.
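As an illustration of how flow logs help with security group rules, the sketch below parses records in the default flow log format (14 space-separated fields) and keeps the rejected ones. It assumes the default format with no header line; the function names and sample records are illustrative, and custom formats would need a different field list.

```python
# Field order of the default VPC Flow Logs record format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_record(line):
    """Split one default-format record into a dict of string values."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def rejected_records(lines):
    """Return the parsed records whose traffic was rejected."""
    parsed = (parse_flow_record(line) for line in lines)
    return [record for record in parsed if record["action"] == "REJECT"]


if __name__ == "__main__":
    # Illustrative records: SSH traffic accepted, RDP traffic rejected.
    sample = [
        "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
        "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
        "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 "
        "49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
    ]
    for record in rejected_records(sample):
        print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"])
```

A burst of rejected records for a port you expected to be open is a typical hint that a security group or network ACL rule is stricter than intended.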
Resource Tagging
Resource tags are a way to assign metadata to any and all resources that are created within your AWS account. These can help manage, identify, search for, and organize resources by different criteria. Common reasons to tag resources are for technical categorization, automation, separation of business units, and security purposes. Follow the Tagging Best Practices for your AWS resources. Datacoral automatically tags all resources that we create with the tag datacoral.
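A tagging policy is easier to enforce when it is checked automatically. The sketch below flags resources that lack required tag keys; the inventory shape (a list of dicts with `id` and `tags`), the required keys, and the function name are all assumptions for illustration, since the real inventory would come from whatever tooling you use to list your resources.

```python
# Hypothetical tag keys that every resource is expected to carry.
REQUIRED_TAG_KEYS = frozenset({"owner", "environment"})


def missing_tags(resources, required=REQUIRED_TAG_KEYS):
    """Map each non-compliant resource id to its sorted list of missing tag keys."""
    report = {}
    for resource in resources:
        missing = set(required) - set(resource.get("tags", {}))
        if missing:
            report[resource["id"]] = sorted(missing)
    return report


if __name__ == "__main__":
    # Made-up inventory for illustration.
    inventory = [
        {"id": "bucket-a", "tags": {"owner": "data-eng", "environment": "prod"}},
        {"id": "bucket-b", "tags": {"owner": "data-eng"}},
        {"id": "instance-c"},
    ]
    for resource_id, keys in missing_tags(inventory).items():
        print(resource_id, "is missing:", ", ".join(keys))
```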
Conclusion
At most companies, auditing is an afterthought. But we have found that, because public clouds like AWS make these things easy to set up, even small companies can start off on the right track by following a few simple steps. Our customers get the benefit of our experience, since we help them reach full “auditability” of their AWS accounts as part of installing Datacoral!
Learn more by writing us at hello@datacoral.co or sign up for a free trial below.