Our top 5 Data Lessons from JPM

A common question at JPM was how to build a secure system for managing scientific data, and how to integrate your tooling with data producers and consumers.

So here are our five data lessons from meeting with biotech executives.

  1. AI provides some great tooling, but the next wave is integrating those tools to drive outcomes. Why? Because tools cannot help if the data is not usable.
  2. People are tired of hearing about ELN, LIMS, and SDMS. They want tools that help them build their data infrastructure and integrate easily with their own tools and instruments. The industry is actively seeking alternatives that fit into its existing ecosystem.
  3. The industry is building in-house, fine-tuned LLM infrastructure. The primary concern is keeping that deployment secure.
  4. Data volume has greatly increased, particularly in molecular diagnostics, High Throughput Screening, and Next Generation Sequencing workflows. The industry therefore prefers tooling that can handle substantial volumes of data and compute. Traditional ELN and LIMS are not fit for this problem.
  5. To build an AI-first techbio company, you need an API-first data infrastructure, so there is no dependency on one tool. The top priority when purchasing external tools should be the ability to switch between them easily. Bio companies seek secure tools that connect with their in-house LLM deployment and scale effectively.

Thinking of building an AI-first data infrastructure for bio? Here are some tips:

  1. Don’t implement a classic scientific data management system (SDMS), ELN, or LIMS. Look for modern alternatives to ELN, LIMS, and SDMS, similar to Scispot.
  2. Alternative systems help you avoid relying on a single technology and make it easy to transfer data between systems. Here, 'alternative' refers to systems that prioritize data import and export over dependence on a single tool.
  3. Repurpose your laboratory informatics application (such as an alternative ELN/LIMS/MES/LES) as the data layer. Your laboratory data is crucial for building your data strategy.
  4. Build your own fine-tuned Large Language Models, and make sure you build your Intellectual Property in-house.
  5. When deciding whether to buy or build, prioritize an API-first architecture that ensures security, compliance, and data integration.
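
To make the "easy to switch" idea concrete, here is a minimal Python sketch. It assumes each tool is wrapped behind one small interface; the names `LabDataSource`, `InMemoryELN`, `CsvBackedLIMS`, and `export_experiment` are hypothetical and do not come from any vendor's API.

```python
import csv
import io
import json
from typing import Protocol


class LabDataSource(Protocol):
    """Hypothetical interface that every wrapped tool is assumed to satisfy."""

    def fetch_records(self, experiment_id: str) -> list[dict]: ...


class InMemoryELN:
    """Stand-in for one tool: records already held as dictionaries."""

    def __init__(self, records: list[dict]) -> None:
        self._records = records

    def fetch_records(self, experiment_id: str) -> list[dict]:
        return [r for r in self._records if r["experiment_id"] == experiment_id]


class CsvBackedLIMS:
    """Stand-in for a second tool: records arrive as a CSV export."""

    def __init__(self, csv_text: str) -> None:
        self._rows = list(csv.DictReader(io.StringIO(csv_text)))

    def fetch_records(self, experiment_id: str) -> list[dict]:
        return [
            {"experiment_id": row["experiment_id"], "result": float(row["result"])}
            for row in self._rows
            if row["experiment_id"] == experiment_id
        ]


def export_experiment(source: LabDataSource, experiment_id: str) -> str:
    """Downstream code depends only on the interface, not on a specific tool."""
    return json.dumps(source.fetch_records(experiment_id), sort_keys=True)


if __name__ == "__main__":
    eln = InMemoryELN([{"experiment_id": "EXP-1", "result": 0.42}])
    lims = CsvBackedLIMS("experiment_id,result\nEXP-1,0.42\n")
    # Swapping the backend does not change the calling code.
    print(export_experiment(eln, "EXP-1"))
    print(export_experiment(lims, "EXP-1"))
```

Because `export_experiment` only knows about `fetch_records`, replacing one tool with another means writing one new wrapper rather than changing every consumer.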

Here is an image we made with one of Scispot's customers, who highlighted their ideal data flow.

Data Infrastructure for an AI-First Biotech

Now the question is, do we still need ELN, LIMS and SDMS to connect data producers and consumers?

The industry desperately needs alt systems: tooling built from the ground up with an API-first architecture.

Enterprise biopharma companies have two goals in 2024:

  1. Accelerate & improve scientific outcomes
  2. Innovate to produce Intellectual Property

Enterprise biotech companies are looking for applications such as Scispot GLUE to make that integration seamless.

To achieve these two goals, the industry needs to invest in the following systems:

  1. alt-SDMS: Modern software that handles scientific data. It collects, arranges, and saves data from lab tools and apps while keeping track of its origin. An alt-SDMS must expose a secure Application Programming Interface (API).
  2. alt-ELN: A digital tool for storing, organizing, and sharing research data in flexible, experimental settings. An alt-ELN must expose a secure endpoint to connect with external systems.
  3. alt-LIMS: A Laboratory Information Management System (LIMS) is software that helps manage a laboratory's operations and data accuracy. It can integrate with ELN software for better tracking and management. An alt-LIMS must be able to integrate seamlessly with instruments.
  4. Computational tools in biology: Software that analyzes data, builds models, and simulates biological systems.
  5. Ontology search and chain-of-custody tracking: Tools to locate ontologies easily through visual search, and to track and discover your chain of custody while conducting experiments in the lab.
  6. LLMs for biology: Machine-learning models used for tasks like language processing and sequence processing. They help with protein design, drug discovery, and genetic research.
  7. CRO management and CRO data.


As you build or buy these tools, expect a significant investment in the following:

1. Integrating Tools for Smooth Data Flow: In biology R&D labs, it's crucial to connect different software and hardware tools. These connections allow data to move between tools without manual steps and to be processed effectively.

Example: Imagine a lab that studies genes. High-tech machines sequence genes, while computer programs analyze the resulting data.

Linking these instruments lets data from the gene sequencer be transmitted, analyzed, and documented quickly, which makes the whole process more efficient.

2. Uniform Data Language: A standard data dictionary is like having a common language in the lab. It ensures everyone records and interprets data in the same way, which is important for accuracy and consistency.

In a bacteria lab, this dictionary defines how to measure bacterial growth, including which units to use. This way, researchers can compare data from different experiments accurately.
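
A data dictionary like this can be sketched in a few lines of Python. The field names (`od600`, `incubation_time`), the units, and the allowed ranges below are illustrative assumptions, not a standard; a real lab would define its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One dictionary entry: what a field means and how it must be recorded."""
    description: str
    unit: str       # canonical unit everyone reports in
    minimum: float  # smallest plausible value in the canonical unit
    maximum: float  # largest plausible value in the canonical unit


# Hypothetical dictionary for a bacterial growth assay.
DATA_DICTIONARY = {
    "od600": FieldSpec("Optical density at 600 nm", "AU", 0.0, 4.0),
    "incubation_time": FieldSpec("Time since inoculation", "h", 0.0, 72.0),
}

# How many of the reported unit make up one canonical unit.
UNITS_PER_CANONICAL = {
    ("min", "h"): 60.0,
    ("h", "h"): 1.0,
    ("AU", "AU"): 1.0,
}


def normalize(field: str, value: float, unit: str) -> float:
    """Convert a reported value to the canonical unit and check its range."""
    spec = DATA_DICTIONARY[field]  # KeyError if the field is not in the dictionary
    divisor = UNITS_PER_CANONICAL.get((unit, spec.unit))
    if divisor is None:
        raise ValueError(f"{field}: cannot convert {unit!r} to {spec.unit!r}")
    canonical = value / divisor
    if not spec.minimum <= canonical <= spec.maximum:
        raise ValueError(f"{field}: {canonical} {spec.unit} is out of range")
    return canonical


if __name__ == "__main__":
    # Two labs report incubation time differently; both end up in hours.
    print(normalize("incubation_time", 90, "min"))
    print(normalize("incubation_time", 1.5, "h"))
```

Rejecting unknown units and out-of-range values at the point of entry is what keeps measurements from different experiments comparable later.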

3. Trusted Data Sources: Knowing which data to trust is key in biology R&D labs. This "source of truth" is the main data that scientists rely on for their conclusions.

In a lab studying proteins, the most reliable data for understanding them comes from experiments like mass spectrometry.

4. Tracking Samples in alt-LIMS: A lab using alt-LIMS can track the entire process of working with cell cultures. This makes sure that all information about the sample's journey is recorded accurately and can be easily found.
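
As a rough illustration of what recording a sample's journey means, here is a minimal Python sketch of an append-only event log for one cell culture. The `Sample` and `CustodyEvent` classes and the sample ID are hypothetical; they are not the data model of Scispot or any particular LIMS.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CustodyEvent:
    """One immutable step in a sample's journey."""
    action: str
    actor: str
    timestamp: datetime


@dataclass
class Sample:
    sample_id: str
    events: list[CustodyEvent] = field(default_factory=list)

    def record(self, action: str, actor: str,
               timestamp: Optional[datetime] = None) -> None:
        """Append an event; defaults to the current UTC time."""
        when = timestamp or datetime.now(timezone.utc)
        self.events.append(CustodyEvent(action, actor, when))

    def history(self) -> list[str]:
        """Return the journey as readable lines, oldest event first."""
        ordered = sorted(self.events, key=lambda e: e.timestamp)
        return [f"{e.timestamp.isoformat()} {e.actor}: {e.action}" for e in ordered]


if __name__ == "__main__":
    culture = Sample("CC-001")
    culture.record("thawed", "alice")
    culture.record("passaged", "bob")
    for line in culture.history():
        print(line)
```

Events are only ever appended, never edited, so the full history of who did what and when can always be reconstructed.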

5. Understanding Different Data Types: In pharmaceutical studies, primary data could be results from blood examinations, ancillary data might be the timestamps of each test, and the information for analysis could be charts illustrating the variation in drug concentration over a period.
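
The same distinction can be sketched in Python. The example below assumes concentrations in ng/mL and uses made-up values; `BloodTest`, `concentration_curve`, and `peak` are illustrative names only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BloodTest:
    concentration_ng_ml: float  # primary data: the measured result
    taken_at: datetime          # ancillary data: when the test was run


def concentration_curve(tests: list[BloodTest],
                        dose_time: datetime) -> list[tuple[float, float]]:
    """Analysis data: (hours since dose, concentration) points in time order."""
    ordered = sorted(tests, key=lambda t: t.taken_at)
    return [
        ((t.taken_at - dose_time).total_seconds() / 3600, t.concentration_ng_ml)
        for t in ordered
    ]


def peak(curve: list[tuple[float, float]]) -> tuple[float, float]:
    """Return the (time, concentration) point with the highest concentration."""
    return max(curve, key=lambda point: point[1])


if __name__ == "__main__":
    dose = datetime(2024, 1, 10, 8, 0)
    tests = [
        BloodTest(20.0, datetime(2024, 1, 10, 10, 0)),
        BloodTest(12.0, datetime(2024, 1, 10, 9, 0)),
        BloodTest(8.0, datetime(2024, 1, 10, 12, 0)),
    ]
    curve = concentration_curve(tests, dose)
    print(curve)
    print(peak(curve))
```

Keeping the raw measurement and its timestamp separate from the derived curve means the analysis can always be regenerated from the primary and ancillary data.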

To learn about making tools for your in-house data infrastructure, contact satya@scispot.io.

