Data Science Platform for Pharmaceutical R&D Processes
About scinteco gmbh
Scinteco is an Austrian software company founded in 2011 and based in Vienna that specializes in the design and development of IT solutions, applications, and products for the pharmaceutical industry.
Every industry needs continuous improvement, and the pharmaceutical sector is no exception. scinteco gmbh provides cutting-edge solutions that support its research and development (R&D) processes. To offer several deployment options for its platform, scinteco needed public cloud providers, and AWS in particular, in its deployment portfolio. This enables scinteco to serve customers who follow a cloud-first approach or plan to move to AWS. An AWS cloud environment was therefore needed that could run scinteco’s platform and take advantage of the cloud’s flexibility, scaling, and security features.
For a leader and innovator, it is crucial that the platform and its infrastructure components can be set up in diverse environments at the push of a button to minimize lead times. The solution should be easy to use, enabling quick setup and deployment without extensive technical expertise.
R&D processes often use internal corporate data and secrets. Security therefore has to be covered on several levels: data must be encrypted in transit and at rest, and it must travel only over internal networks, never over the public internet, to avoid leaks.
True innovation involves numerous attempts, failures, and restarts. Releasing a Minimum Viable Product (MVP) shouldn't require substantial infrastructure expenses. Failures should be cheap, and successes should be able to scale. Conventional IT infrastructure demands considerable initial investment, and modifying key architectural elements increases these costs further. Addressing overarching concerns such as security and DDoS protection also requires upfront investment.
Based on scinteco’s platform requirements, the following architecture was built with robustness and security in mind: all traffic remains within the AWS environment, and the only entry point is a VPN connection.
Compute: The main application logic of the platform is split into three services, each running on a separate EC2 instance backed by an Auto Scaling group that automatically relaunches the instance if it goes down. Additionally, the platform includes an HPC (high-performance computing) cluster running Slurm, which consists of one head node and several compute nodes, also running on EC2.
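The self-healing behavior described above can be illustrated with a minimal sketch. It assumes each service gets an Auto Scaling group pinned to exactly one instance, so a failed instance is replaced rather than scaled out; the source does not state the group sizes. The service names, subnet IDs, launch template names, and grace period are hypothetical placeholders. The dictionary keys follow the parameter names of the EC2 Auto Scaling `CreateAutoScalingGroup` API, but the code only builds the parameters and does not call AWS.

```python
from typing import Any, Dict, List


def self_healing_asg_params(service: str, subnet_ids: List[str]) -> Dict[str, Any]:
    """Build parameters for a single-instance Auto Scaling group.

    Min, max, and desired capacity are all 1 (an assumption), so the group
    never scales out; it only replaces the instance when it becomes unhealthy.
    """
    if not subnet_ids:
        raise ValueError("at least one private subnet is required")
    return {
        "AutoScalingGroupName": f"{service}-asg",
        # Hypothetical launch template name; "$Latest" selects its newest version.
        "LaunchTemplate": {"LaunchTemplateName": f"{service}-lt", "Version": "$Latest"},
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        # The API expects the subnet IDs as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "HealthCheckType": "EC2",
        "HealthCheckGracePeriod": 300,  # seconds; illustrative value
    }


# Hypothetical names for the three services and the private subnets.
SERVICES = ["service-a", "service-b", "service-c"]
PRIVATE_SUBNETS = ["subnet-aaaa1111", "subnet-bbbb2222"]

asg_params = {name: self_healing_asg_params(name, PRIVATE_SUBNETS) for name in SERVICES}
```

Spreading the group across more than one private subnet lets the replacement instance start in another Availability Zone if one zone is impaired. In the project itself, the equivalent settings would live in the Terraform code rather than in Python.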
Database and Storage: All database and storage services were chosen with high availability and scalability in mind so that the platform is future-proof. Both the main application and the HPC cluster use Aurora to store their stateful application data, and OpenSearch is used to index certain data and provide search capabilities. Furthermore, S3 serves as central file storage, and EFS is used to exchange data between one of the main services and the HPC cluster.
Logging and Monitoring: Logs from all EC2 instances are sent to CloudWatch using the CloudWatch agent, and they are also sent to a central Grafana instance. Additionally, CloudWatch metrics are used to check the health of the whole solution and its individual components.
Security: The whole infrastructure is set up exclusively in private subnets and uses VPC endpoints to connect to other AWS services. Essential data stores such as Aurora, OpenSearch, EFS, and S3 are secured using KMS encryption with customer-managed keys. Additionally, data transmission is encrypted with a certificate issued by AWS Certificate Manager. Access to AWS services is regulated through IAM roles, following the principle of least privilege. Moreover, static credentials are stored in AWS Secrets Manager.
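The security requirements above lend themselves to an automated check. The following is a minimal sketch of such a check, not part of the described solution: the `DataStore` record, its fields, and the example values are hypothetical and would in practice be filled from the deployed resources or from the Terraform plan.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataStore:
    """Hypothetical summary of one data store's security settings."""
    name: str
    kms_key_id: Optional[str]      # None means no KMS key is configured
    customer_managed_key: bool     # True if the key is customer-managed
    tls_enforced: bool             # encryption in transit
    private_access_only: bool      # private subnets or VPC endpoints only


def violations(store: DataStore) -> List[str]:
    """Return the requirements from the text that this store does not meet."""
    found: List[str] = []
    if store.kms_key_id is None:
        found.append("not encrypted at rest")
    elif not store.customer_managed_key:
        found.append("KMS key is not customer-managed")
    if not store.tls_enforced:
        found.append("encryption in transit not enforced")
    if not store.private_access_only:
        found.append("reachable outside private subnets")
    return found


# Illustrative records; the key ID is a placeholder, not a real identifier.
aurora = DataStore("aurora", "hypothetical-key-id", True, True, True)
legacy = DataStore("legacy-bucket", None, False, False, True)

print(violations(aurora))
print(violations(legacy))
```

A check like this could run after each deployment so that a misconfigured store is flagged before any data is written to it.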
Deployment and Orchestration: The whole AWS infrastructure is implemented using the Infrastructure as Code (IaC) tool Terraform. The setup is highly configurable so that it can easily be deployed to different environments. Both the initial rollout of the whole infrastructure and subsequent changes can be applied with a single Terraform run.
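One way to picture a configurable, multi-environment setup is a shared base configuration with small per-environment overrides. The sketch below merges the two and renders a JSON variables file of the kind Terraform can read through `-var-file`. All variable names, instance types, regions, and environment names are assumptions for illustration; the source does not describe the actual configuration layout.

```python
import json
from copy import deepcopy
from typing import Any, Dict

# Hypothetical defaults shared by every environment.
BASE_CONFIG: Dict[str, Any] = {
    "region": "eu-central-1",
    "service_instance_type": "m5.large",
    "hpc": {"max_compute_nodes": 4, "compute_instance_type": "c5.xlarge"},
}

# Hypothetical per-environment overrides; only the differences are listed.
ENV_OVERRIDES: Dict[str, Dict[str, Any]] = {
    "staging": {},
    "prod": {"service_instance_type": "m5.2xlarge", "hpc": {"max_compute_nodes": 20}},
}


def merge_config(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge override into a copy of base, leaving base untouched."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def render_tfvars(environment: str) -> str:
    """Return the contents of a <environment>.tfvars.json file as a string."""
    merged = merge_config(BASE_CONFIG, ENV_OVERRIDES[environment])
    merged["environment"] = environment
    return json.dumps(merged, indent=2, sort_keys=True)


print(render_tfvars("prod"))
```

Written to a file such as `prod.tfvars.json`, the output could be applied with `terraform apply -var-file=prod.tfvars.json`, provided the Terraform code declares matching variables. Keeping overrides minimal makes the differences between environments easy to review.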
The solution enables scinteco to deploy the whole infrastructure easily and securely, both internally and for customers. Some key achievements are described below:
- The lead time has decreased considerably: the whole setup process, from a blank AWS account until the platform is fully up and running, takes less than an hour.
- The self-healing mechanism based on Auto Scaling groups has significantly reduced manual maintenance effort, so the operations team can focus on other topics.
- The HPC cluster can scale dynamically, launching and configuring new EC2 instances whenever they are needed to run a job.
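The dynamic scaling in the last point can be sketched as a simple sizing rule: launch enough compute nodes to cover the pending jobs that idle nodes cannot absorb, without exceeding a cluster limit. This is an illustrative simplification under assumed inputs, not the project's implementation; in a real Slurm cluster the scheduler's own cloud and power-saving logic decides which nodes to resume. The function name and parameters are hypothetical.

```python
from typing import List


def nodes_to_launch(
    requested_nodes: List[int],
    idle_nodes: int,
    total_nodes: int,
    max_nodes: int,
) -> int:
    """Return how many new compute nodes to start.

    requested_nodes: node count requested by each pending job
    idle_nodes:      running nodes that currently have no job
    total_nodes:     all running compute nodes (busy and idle)
    max_nodes:       upper limit for the cluster size
    """
    demand = sum(requested_nodes)
    shortfall = max(0, demand - idle_nodes)     # demand idle nodes cannot cover
    headroom = max(0, max_nodes - total_nodes)  # nodes that may still be added
    return min(shortfall, headroom)


# Two pending jobs need 3 nodes; 1 node is idle, so 2 more are launched.
print(nodes_to_launch([2, 1], idle_nodes=1, total_nodes=4, max_nodes=10))
```

Capping the result at `max_nodes` keeps a burst of submitted jobs from launching an unbounded number of instances.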
Infrastructure as Code (IaC)
- The entire infrastructure required to run the scinteco platform can be established in under an hour at the push of a button.
- Making configuration adjustments is streamlined through the modification of central configuration files.
Scalability and Flexibility
- The solution is prepared to scale all individual parts of the infrastructure.
- The HPC setup can flexibly use the best-fitting EC2 instances for the submitted jobs, which keeps efficiency high.
Security
- Data is encrypted at rest and in transit.
- No data is transferred via the public internet, so all data remains private.
About the Partner
ByteSource Technology Consulting GmbH, based in Vienna, is one of the leading experts in the DACH region for AWS, Atlassian, AI, DevOps, and agile software development as well as technical consulting. As the largest Atlassian Platinum Partner and an AWS Advanced Tier Services Partner in Austria, ByteSource demonstrates its expertise in scaling the Atlassian toolset and in agile transformation on the Data Center platform. With a focus on DevOps and cloud journeys as well as migration and advisory services, ByteSource realizes large-scale projects up to three times faster than IT companies with conventional approaches. The integration of generative AI underscores its commitment to delivering enterprise-level solutions with exceptional effectiveness.