12/12/2024 Informative Is PostgreSQL the Only Vector Database You Will Ever Need? PostgreSQL provides the pgvector extension, an efficient option for LLM-based applications that can replace costly dedicated vector databases.
11/29/2024 Company News Cloud Native Meetup Recap Karlsruhe offers a vibrant tech scene and we are proud to be part of a group organizing expert & community meetups like this one.
11/22/2024 Informative AI Summarization of Long Documents: Tackling Long Documents with Precision Long texts are difficult to summarize, but recursion can divide them into small parts. The approach is precise and preserves the meaning at every step of the iteration.
10/29/2024 Tutorial Full Metal Buying a used server on eBay Kleinanzeigen and preparing it to be cloudified? Follow along to see what it takes to get a piece of metal running.
10/17/2024 Tutorial Structure PDF Table Data for AI Applications with GMFT GMFT is a fast, lightweight toolkit for extracting tables from PDFs into formats like CSV, JSON, and Pandas DataFrames. Leveraging Microsoft's Table Transformer, GMFT efficiently processes both text and image tables, ensuring high performance for reliable data extraction.
10/2/2024 Tutorial Extracting Metadata from PDFs with Named Entity Recognition using spaCy NER identifies entities like people and locations in text. spaCy automates this with pre-trained models, offering accuracy, speed, and multi-language support. It handles large datasets more efficiently than rule-based methods.
9/9/2024 Informative Taking Advantage of the Long Context of Llama 3.1 Llama 3.1 allows a context window of 128k tokens. We investigate how to take advantage of this long context of Llama without running into performance issues.
8/28/2024 Informative Exploring Options for Open-Source Multimodels in 2024 Multimodal models can process several data types such as text, audio, and images, which enables them to generate nuanced, accurate, and contextually aware responses. We explored some of the best open-source multimodal models available.
8/12/2024 Informative Best sub 7b AI model for categorizing documents in August 2024 Document classification can be achieved by using a fine-tuned discriminative model or by applying prompt engineering to a generative model.
8/2/2024 Informative Best Open Source Sentence Embedding Models in August 2024 Sentence embedding models capture the overall semantic meaning of the text. We tested and compiled the best-performing open-source models for you.
7/29/2024 Tutorial Deploying Faster-Whisper on CPU Learn how to deploy a Faster-Whisper server to increase transcription speed by 4x and enable real-time voice transcription on CPU-only hardware.
6/27/2024 Informative Example Project for a Landscape Deployment on Codesphere This template utilizes Codesphere's new Landscape Deployment feature, which allows us to deploy complex monorepo applications.