Technology stack

Build a robust and scalable AI & data engineering stack

Our multi-layer AI & data engineering tech stack helps teams manage and analyze data effectively.

Build real-time engineering data platforms based on tried-and-tested data warehouses, data lakes, data integration tools, data visualization, BI, and governance software.

Programming languages and ML & AI frameworks

Python

Programming language that lets you work more quickly and integrate your systems more effectively

JavaScript

Lightweight interpreted programming language with first-class functions

Django

High-level Python web framework

FastAPI

Fast web framework for building APIs with Python

Flask

Lightweight WSGI web application framework

sktime

A unified framework for machine learning with time series

LightGBM

A gradient boosting framework based on tree-based learning algorithms

LlamaIndex

A simple, flexible framework for building knowledge assistants using LLMs connected to your enterprise data

Libraries

PyTorch

An ML library based on the Torch library, used for applications such as computer vision and natural language processing

Keras

An open-source library that provides a Python interface for artificial neural networks

Prophet

A procedure for forecasting time series data based on an additive model

The Microsoft Cognitive Toolkit

A unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph

FastAI

A deep learning library which provides high-level components

TensorFlow

A software library for machine learning and artificial intelligence

scikit-learn

Free and open-source machine learning library for the Python programming language

XGBoost

A scalable, distributed gradient-boosted decision tree (GBDT) machine learning library

Sentence Transformers

The Python module for accessing, using, and training state-of-the-art text and image embedding models

OpenCV

The world's biggest computer vision library

BigDL

A distributed deep learning library for Apache Spark

Horovod

A distributed deep learning training framework for PyTorch, TensorFlow, Keras and Apache MXNet

Seaborn

A Python data visualization library based on matplotlib

Pandas

A fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language

Tableau Server Client

A Python library for the Tableau Server REST API

Facebook AI Similarity Search (Faiss)

A library for efficient similarity search

Transform your data with proven tools and expertise

Free consultation

Natural language processing

NLTK (Natural Language Toolkit)

A suite of open-source Python modules, data sets, and tutorials supporting research and development in natural language processing

spaCy

A free open-source library for Natural Language Processing in Python

Gensim

A Python library for topic modelling, document indexing and similarity retrieval with large corpora

fastText

Library for efficient text classification and representation learning

Computer vision

Detectron2

Facebook AI Research's next-generation library that provides state-of-the-art detection and segmentation algorithms

Torchvision

A package of popular datasets, model architectures, and common image transformations for computer vision

MMDetection

An open source object detection toolbox based on PyTorch

Pillow

A free and open-source additional library for the Python programming language that adds support for opening, manipulating, and saving many different image file formats

Albumentations

A comprehensive, high-performance framework for augmenting images to improve machine learning models

Data preprocessing and ETL

PySpark

Enables you to perform real-time, large-scale data processing in a distributed environment using Python

Polars

An open-source library for data manipulation, known for being one of the fastest data processing solutions on a single machine

Dask

A flexible parallel computing library for analytics

Data analysis & visualization

NumPy

A Python library for working with arrays, with functions for linear algebra, Fourier transforms, and matrices

Plotly

For an easy transition from Python notebooks to AI-powered production data apps

Bokeh

An interactive visualization library for modern web browsers

Altair

A declarative visualization library for Python

Handle Big Data with Xenoss and tools designed for enterprise scalability

MLOps

MLflow

ML and GenAI made simple. Build better models and generative AI apps on a unified, end-to-end, open source MLOps platform.

Data Version Control (DVC)

Create pipelines that connect your versioned datasets, code, and models together for effective experiment tracking the GitOps way.

Kubeflow

Kubeflow makes artificial intelligence and machine learning simple, portable, and scalable

TensorBoard

A tool for providing the measurements and visualizations needed during the machine learning workflow

Weights & Biases (W&B)

The AI developer platform, with tools for training models, fine-tuning models, and leveraging foundation models.

ClearML

An MLOps platform designed for the most complex, demanding environments and novel use cases

NannyML

An open-source Python library for estimating post-deployment model performance

Comet.ml

An end-to-end model evaluation platform for AI developers, with LLM evaluations and experiment tracking

Cloud platforms

Amazon SageMaker

A cloud-based machine-learning platform that allows the creation, training, and deployment of machine-learning (ML) models

Amazon Bedrock

A platform to simplify the building of generative AI apps

Amazon CodeGuru

Helps improve code quality and automate code reviews by scanning and profiling your Java and Python applications

Amazon Forecast

A fully managed time-series forecasting service that uses the same machine learning technology used at Amazon.com

Azure Machine Learning

A cloud service for accelerating and managing the machine learning (ML) project lifecycle

Google Vertex AI

Vertex AI is a fully-managed, unified AI development platform for building and using generative AI

Big data technologies

Apache Hadoop

A framework that allows for the distributed processing of large data sets across clusters of computers

Apache Spark

A multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters

Apache Kafka

An open-source and distributed event store and stream-processing platform

Database & Vector Database

Qdrant

An enterprise-ready, high-performance, massive-scale vector database available as open-source, cloud, and managed on-premises solutions

Milvus

The high-performance vector database

Pinecone

The vector database to build knowledgeable AI

MongoDB

An integrated suite of data services centered around a cloud database designed to accelerate and simplify how you build with data

Weaviate

An open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering

Enterprise innovation starts with the right data engineering stack

Schedule your free consultation

LLM Models

GPT-4o

A multilingual, multimodal generative pre-trained transformer developed by OpenAI

Llama 3

A new state-of-the-art for LLM models

PaLM 2

A next generation language model with improved multilingual, reasoning and coding capabilities

Claude

A next generation AI assistant built by Anthropic and trained to be safe, accurate, and secure

DALL·E 2

An AI system that can create realistic images and art from a description in natural language

Stability AI

An open generative AI video model based on the image model Stable Diffusion

Phi-2

A transformer-based model with a next-word prediction objective

Google Gemini

A generative artificial intelligence chatbot developed by Google, based on the large language model (LLM) of the same name

Data storage

Get expert guidance on choosing between data warehouse, data lake, or a mix of the two for data storage

Open-source solutions

Apache HDFS

Distributed file system for scalable storage and processing of large datasets across clusters

Apache Druid

Real-time analytics database for fast, scalable querying and ingesting of large event streams

ClickHouse

Columnar database management system optimized for high-performance real-time analytics on large datasets

Ceph

Distributed storage system providing scalable object, block, and file storage for large data environments

MinIO

High-performance object storage system compatible with the S3 API for cloud-native environments

Paid solutions

Amazon S3

Scalable object storage service for secure data storage, retrieval, and backup in the cloud

Azure Data Lake Storage

Secure cloud storage service optimized for big data analytics and processing

Google Cloud Storage

Scalable, secure object storage solution for unstructured data, with built-in analytics and backup

IBM Cloud Object Storage

High-security scalable cloud object storage for storing and managing unstructured data

Snowflake

Cloud-based data platform for scalable data warehousing, analytics, and secure data sharing

Databricks

Unified data analytics platform for big data processing, machine learning, and collaborative data science

Google BigQuery

Fully managed, serverless data warehouse for fast, scalable analytics on large datasets

Azure Blob Storage

Scalable object storage service for unstructured data, optimized for cloud applications and analytics

Data ingestion

Scalable and cost-effective data ingestion tools for batch processing and data streaming.

Open-source solutions

Airbyte

Syncing data between APIs, databases, and warehouses

Singer

Extracting, transforming, and loading data

Logstash

Collecting, transforming, and forwarding data in real time

Fluentd

Unifying and processing logs and event data in real time

Apache Kafka

Real-time data pipelines and applications

Redpanda

High-performance, low-latency real-time data processing

Paid solutions

IBM InfoSphere DataStage

Designing, developing, and running data integration workflows

Oracle GoldenGate

Transactional data management

SAP Data Services

Data integration, transformation, and cleansing

Google Cloud Data Fusion

Building and managing scalable data integration pipelines

Azure Data Factory

Creating, scheduling, and orchestrating ETL workflows at scale

AWS Glue

Discovering, preparing, and integrating data for analytics and ML

Azure Event Hubs

Real-time event ingestion and processing

Google Pub/Sub

Real-time messaging service

AWS Kinesis Data Streams

Collecting, processing, and analyzing streaming data

Achieve your goals with an experienced software development partner

Data processing and transformation

Build a tech stack for processing raw data and transforming it to fit business logic

Open-source solutions

Apache Flink

Stream processing framework for real-time, scalable, and distributed data processing and analytics

Apache Spark

Unified analytics engine for large-scale data processing, featuring batch and real-time streaming capabilities

TensorFlow

Platform for machine learning and deep learning, used for building and deploying AI models

dbt

Data transformation tool enabling analytics engineers to transform, test, and document data in SQL

Paid solutions

Azure Data Factory

Cloud-based data integration service for orchestrating and automating data movement and transformation at scale

Databricks

Unified data platform for big data processing, machine learning, and collaborative data engineering

AWS Glue

Fully managed ETL service for data discovery, preparation, and integration across various data sources

Google Dataflow

Fully managed service for stream and batch data processing using Apache Beam pipelines

AWS Lambda

Serverless compute service for running code in response to events, without managing infrastructure

Azure Stream Analytics

Let’s Build Your AI Solution

Xenoss is your go-to partner in building custom AI and Data Engineering software and an extension of your tech team.

The Xenoss team helped us build a well-balanced tech organization and deliver the MVP within a very short timeline. I particularly appreciate their ability to hire extremely fast and to generate great product ideas and improvements.

Oli Marlow Thomas,

CEO and founder, AdLib

Get a free consultation

We are ready to help with tech challenges you might have.