
A Data Scientist Blog

Best practices for AWS CDK L3 constructs

In advanced use of AWS CDK, sooner or later you need to centralize some logic and resources so that they can be reused quickly, cutting down on boilerplate code. A first analysis might suggest implementing a "homemade" factory (Pythonic, yes, but not in line with CDK best practices); the more correct answer, though, is probably to implement an L3 construct, described as:

designed to help you complete common tasks in AWS, often involving multiple kinds of resources.

A comparison between AWS databases

Main database types:

  • Relational: data is stored in tabular form (rows and columns), where each row represents a unique record. Tables can be related to each other through joins and queried via SQL;
  • Key-value: non-relational database where each record is stored as a unique key with its associated value, resembling a dictionary-like structure;
  • Document: semi-structured, hierarchical databases suited to catalogs and content management systems, with records often stored as JSON;
  • Graph: data is stored as a graph, with nodes representing entities and edges representing the relationships between them;
  • Time-series: database optimized for records indexed by timestamp.
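To make the five models more concrete, here is a minimal sketch that stores roughly the same toy information in each shape, using only the Python standard library. The names (`users`, `orders`, `alice`, the `BOUGHT` edge label) are invented for illustration, and in-memory SQLite and plain Python containers are only stand-ins for the data models, not for any AWS service.

```python
import json
import sqlite3

# Relational: rows in tables, related through a join and queried via SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute("INSERT INTO orders VALUES (10, 1, 'book')")
relational_row = conn.execute(
    "SELECT u.name, o.item FROM users u JOIN orders o ON o.user_id = u.id"
).fetchone()

# Key-value: one opaque value per unique key, like a dictionary.
key_value = {"user:1": "alice"}

# Document: a hierarchical, semi-structured record, here parsed from JSON.
document = json.loads(
    '{"id": 1, "name": "alice", "orders": [{"id": 10, "item": "book"}]}'
)

# Graph: nodes plus labelled edges describing how the nodes relate.
graph = {"nodes": {"alice", "book"}, "edges": [("alice", "BOUGHT", "book")]}

# Time-series: values indexed by timestamp (epoch seconds), kept in time order.
time_series = [(1700000000, 20.5), (1700000060, 20.7)]

print(relational_row, key_value["user:1"], document["orders"][0]["item"])
```

The relational version needs a join to reassemble what the document version keeps nested in a single record, which is the main trade-off between those two models.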

AWS CodeBuild local testing

Suppose you have a CodeBuild project triggered by a push to a given branch of a linked CodeCommit repo. If the build is particularly heavy, you might want to verify it before actually committing to the repo, for example by running the build process specified in buildspec.yml locally.
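As a rough sketch of what a local run can look like, the helper below assembles the command line for `codebuild_build.sh`, the wrapper script AWS publishes alongside its CodeBuild Docker images. It assumes Docker is running, that the script has been downloaded next to the project, and that the local CodeBuild agent image has already been pulled. The function name, the default paths and the image tag are hypothetical; the `-i`, `-a`, `-s` and `-b` flags are used as I understand them from the script's usage text, so check them against the copy you download.

```python
import shlex


def local_build_command(
    image,
    source_dir=".",
    artifact_dir="artifacts",
    buildspec="buildspec.yml",
    script="./codebuild_build.sh",  # assumed location of the downloaded script
):
    """Build the argument list for a local CodeBuild run (does not execute it)."""
    return [
        script,
        "-i", image,         # build environment image to run the build in
        "-a", artifact_dir,  # directory where build artifacts are written
        "-s", source_dir,    # directory containing the source to build
        "-b", buildspec,     # buildspec file to use for the build
    ]


# Hypothetical image tag: use the image your CodeBuild project is configured with.
command = local_build_command("aws/codebuild/standard:7.0")
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` would start the build; keeping the command construction separate makes it easy to inspect before running anything heavy.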

How to display AWS CloudWatch logs in Streamlit

Let's dive into the following scenario:

  • we have some job/task running on AWS
  • we have already built a Streamlit frontend to launch jobs
  • we want to monitor AWS CloudWatch logs generated by the job execution
  • we want neither to switch from our Streamlit frontend to the AWS Console, nor to go crazy hunting for the right log groups/streams to track our job

A possible custom solution is presented below.
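One way to approach the log-fetching part, independent of whatever the full post does, is a small polling helper around the CloudWatch Logs `get_log_events` call. The sketch below takes the client as a parameter, so it works with a real `boto3.client("logs")` as well as a stub; the function name and parameters are assumptions made for illustration, and the log group and stream names are expected to be known already.

```python
def fetch_new_log_lines(logs_client, log_group, log_stream, next_token=None):
    """Return (lines, token) with the events that follow next_token in one stream.

    logs_client is expected to behave like boto3.client("logs").
    """
    kwargs = {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "startFromHead": True,  # read oldest events first
    }
    if next_token is not None:
        # Resume where the previous call stopped instead of re-reading the stream.
        kwargs["nextToken"] = next_token
    response = logs_client.get_log_events(**kwargs)
    lines = [event["message"].rstrip("\n") for event in response["events"]]
    return lines, response["nextForwardToken"]
```

In a Streamlit app, the returned token could be kept in `st.session_state` between reruns and the accumulated lines shown with `st.code("\n".join(lines))`, so each refresh only fetches what is new.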