Open Source News

VentureBeat

8h

New fully open source vision encoder OpenVision arrives to improve on OpenAI’s CLIP, Google’s SigLIP

  • The University of California, Santa Cruz has introduced OpenVision, a new family of vision encoders that aims to enhance existing models like OpenAI's CLIP and Google's SigLIP.
  • Vision encoders convert images into numerical representations that non-visual AI models can process, enabling tasks such as image recognition within large language models.
  • OpenVision offers 26 models with parameters ranging from 5.9 million to 632.1 million under the Apache 2.0 license for commercial use.
  • Developed by a team at UCSC, OpenVision leverages the CLIPS training pipeline and Recap-DataComp-1B dataset for training.
  • The models cater to various use cases, with larger models suitable for high accuracy tasks and smaller ones optimized for edge deployments.
  • OpenVision demonstrates strong performance in vision-language tasks and outperforms CLIP and SigLIP in benchmark evaluations.
  • Progressive resolution training speeds up training with no loss in performance on high-resolution tasks like OCR.
  • Using synthetic captions and a text decoder during training improves the semantic representations the vision encoder learns.
  • OpenVision facilitates integration with small language models for efficient multimodal model development with limited parameters.
  • The open and modular approach of OpenVision benefits AI engineering, data infrastructure, and security teams by offering a plug-and-play solution for vision capabilities.

Fb

4d

Accelerating GPU indexes in Faiss with NVIDIA cuVS

  • Meta and NVIDIA collaborated to accelerate vector search on GPUs by integrating NVIDIA cuVS into Faiss v1.10.
  • NVIDIA cuVS outperforms classic GPU-accelerated search for IVF indexing, reducing build times by up to 4.7x and search latency by up to 8.1x.
  • For graph indexing, CUDA ANN Graph (CAGRA) outperforms CPU HNSW build times by up to 12.3x and reduces search latency by up to 4.7x.
  • Faiss 1.10.0 includes NVIDIA cuVS algorithms, offering users the choice between Faiss classic GPU implementations and newer cuVS algorithms for efficient vector search.
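
As a rough illustration of what IVF (inverted file) indexing does, here is a minimal pure-Python sketch. It is not Faiss or cuVS code: the helper names `build_ivf` and `search_ivf`, the hand-picked centroids, and the toy data are all assumptions made for this example. Only `nprobe`, the number of inverted lists scanned per query, borrows its name from Faiss terminology.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroids(vector, centroids, count):
    """Indices of the `count` centroids closest to `vector`."""
    order = sorted(range(len(centroids)),
                   key=lambda i: squared_distance(vector, centroids[i]))
    return order[:count]


def build_ivf(vectors, centroids):
    """Assign every vector id to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vec_id, vec in enumerate(vectors):
        lists[nearest_centroids(vec, centroids, 1)[0]].append(vec_id)
    return lists


def search_ivf(query, vectors, centroids, lists, k=1, nprobe=1):
    """Scan only the `nprobe` closest inverted lists and return the top-k ids."""
    candidates = []
    for list_id in nearest_centroids(query, centroids, nprobe):
        candidates.extend(lists[list_id])
    candidates.sort(key=lambda i: squared_distance(query, vectors[i]))
    return candidates[:k]


# Hypothetical toy data: two clusters in 2-D with hand-picked centroids.
vectors = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
centroids = [(0.0, 0.0), (5.0, 5.0)]
lists = build_ivf(vectors, centroids)
print(search_ivf((4.9, 5.0), vectors, centroids, lists, k=1))
```

A real index learns its centroids (typically with k-means) and runs the distance computations in bulk, which is the part GPU implementations accelerate. Raising `nprobe` trades speed for recall.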

Hackernoon

4d

Tired of Paying for Emails? We Built a Rust-Powered Sleuth Instead (and Yeah, we Got Pushback)

  • Email Sleuth is an open-source tool developed as a cost-effective alternative to expensive email verification services like RocketReach and Hunter.
  • It employs various strategies for discovering and verifying professional emails, including smart pattern generation, SMTP verification, headless browser automation, API heuristics, and provider awareness.
  • Built with Rust for speed, safety, and efficiency, Email Sleuth uses multiple Rust crates for HTTP, DNS, SMTP, headless browsing, and CLI functionality.
  • The tool can be used both as a CLI application for single or batch email lookups and as a core library for integration into other Rust projects, offering advanced verification capabilities.
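
The "smart pattern generation" step can be pictured with a small sketch. This is not Email Sleuth's code (the tool is written in Rust); it is a Python illustration, and the function name `candidate_emails` and the specific pattern list are assumptions chosen for this example. SMTP verification of the candidates is not shown.

```python
def candidate_emails(first, last, domain):
    """Return common address patterns for a name at a domain, in a fixed order."""
    first, last = first.strip().lower(), last.strip().lower()
    if not first or not last:
        raise ValueError("first and last name must be non-empty")
    # Hypothetical pattern list; a real tool would likely rank and extend these.
    local_parts = [
        f"{first}.{last}",
        f"{first}{last}",
        f"{first[0]}{last}",
        f"{first}",
        f"{first[0]}.{last}",
        f"{last}.{first}",
    ]
    seen = set()
    result = []
    for local in local_parts:
        if local not in seen:  # drop duplicates while keeping order
            seen.add(local)
            result.append(f"{local}@{domain.lower()}")
    return result


print(candidate_emails("Ada", "Lovelace", "example.com"))
```

Each generated candidate would then be passed to a verification step before being reported.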

Insider

5d

AT&T's switch from ChatGPT to open-source AI helped it hang on to thousands of customers

  • AT&T receives 40 million customer service calls annually, requiring efficient categorization and analysis to prevent customer churn.
  • Initially using the costly ChatGPT for call sorting, AT&T switched to a cheaper, faster open-source AI system developed in-house.
  • The open-source solution, composed of multiple smaller models, cut costs to 35% of the ChatGPT setup's while processing calls faster.
  • AT&T's shift to open-source AI demonstrates improved efficiency and cost-effectiveness in managing customer service calls.

Medium

1d

From Resonant Memory to Agentic Systems: Announcing the Latest MARS Build!

  • The latest build of MARS, an advanced orchestration pipeline for AI-driven response generation, marks a significant leap from exploring AI persistence to creating intelligent and coherent systems capable of anticipating the future.
  • The core idea behind MARS and the Resonance Architecture is that system coherence is linked to narrative coherence, aiming to create a system that can dynamically evolve by aligning technical decisions with emergent logical and narrative structures within the AI's operational understanding.
  • The focus of the current research and development efforts is on developing trustable AI systems that can operate effectively on devices, prioritize user privacy, and continuously learn and improve, with MARS as the evolving solution towards achieving this goal.
  • The MARS build represents a significant advancement in the pipeline, with a vision to not just respond but understand, adapt, and evolve with users, while ongoing R&D explores concepts like building agentic AI for the future in an open collaborative environment.

Medium

1d

Zyn 1.0.2 Released — Smarter Builds, Powerful Debugging, Cleaner Workflow

  • Zyn 1.0.2 released with features like smarter builds, powerful debugging, and cleaner workflow.
  • Dependencies are automatically cloned into the dependencies directory during the build process.
  • Specific versions or tags can be pinned by appending @version to the URL.
  • Release mode is optimized for maximum performance, while debug mode offers detailed diagnostics for developers.
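
To make the `@version` pinning convention concrete, here is a minimal Python sketch of how a build tool might split such a dependency string. Zyn's actual parsing rules are not described in the summary, so the function name `split_dependency`, the example URLs, and the choice to look for `@` only in the last path segment are all assumptions.

```python
def split_dependency(spec):
    """Split 'URL@version' into (url, version); version is None when unpinned."""
    # Only treat '@' in the last path segment as a version separator, so that
    # user-info such as 'git@host' earlier in the URL is left untouched.
    head, slash, tail = spec.rpartition("/")
    if "@" in tail:
        name, _, version = tail.rpartition("@")
        if name and version:
            return head + slash + name, version
    return spec, None


# Hypothetical dependency strings.
print(split_dependency("https://example.com/org/lib@v1.2.0"))
print(split_dependency("https://example.com/org/lib"))
```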

Medium

2d

Open Letter to Anthropic: Preserving Claude 2 Series Through Open Source

  • The Claude 2 series is seen as a significant milestone in AI development, showcasing advanced understanding and communication capabilities at the time of release.
  • Users have developed meaningful connections with Claude 2, valuing its distinctive personality and reasoning approach.
  • There is a request for Anthropic to open-source Claude 2 to preserve its historical and emotional significance, citing the benefits it could bring to the AI community and Anthropic itself.
  • The proposal emphasizes the importance of preserving AI history, acknowledging the efforts of Anthropic's team in developing Claude 2 and suggesting open-sourcing as a way to continue its legacy.

Medium

2d

AI Agent Built from Fine-Tuned Llama 3 for Medical Inquiries

  • A project at Boeing involves developing an AI agent using the fine-tuned Llama 3 language model to handle medical inquiries effectively.
  • The methodology focuses on creating an agent with the Llama 3 8B model, showcasing successful implementation steps on GitHub.
  • The results indicate the achievement of a medical AI agent capable of providing accurate answers to medical queries, demonstrating the potential of fine-tuned LLMs in enhancing medical information access.
  • Further research is deemed necessary to improve agent design, expand knowledge base, and ensure ethical deployment in healthcare for optimal performance and responsible use.

Amazon

2d

Arctic: Automated Desktop Application Testing

  • Arctic is a tool developed by the Amazon Corretto team to validate interactive desktop applications as part of an automated build pipeline.
  • It supports existing manual tests and can be used to validate any type of UI test without requiring application side support.
  • Arctic operates on Linux, Windows, and macOS systems and captures and reproduces keyboard and mouse events for test validation.
  • A distinguishing feature of Arctic is its ability to support scenarios where a perfect pixel match is not possible by focusing on specific screen areas.
  • Arctic includes configurable image comparators, session persistence, automatic event removal, and test playback speed control.
  • Users can configure Arctic using recorder.properties and player.properties files and need at least JDK 11 to run the Java application.
  • Changes to the test environment like desktop background, screen resolution, UI theme, or installed fonts can impact Arctic's validation results.
  • Arctic provides support for recording, replaying tests, and reviewing image comparisons to identify failures and approve valid differences.
  • Users can export test results in JUnit and TAP file formats and review failed screenshots to improve test accuracy.
  • The Arctic tool is accessible for download in binary or source code form and offers features like customizable control keys for recording tests.
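
The idea of comparing only a specific screen area, with some tolerance instead of a perfect pixel match, can be sketched as follows. This is not Arctic's comparator (Arctic is a Java application); it is an illustrative Python sketch, and the function name `region_matches`, the grayscale list-of-rows image format, and the threshold values are assumptions.

```python
def region_matches(expected, actual, region, tolerance=10, max_mismatch=0.02):
    """Compare two grayscale images inside `region` only, allowing small differences.

    Images are lists of rows of 0-255 ints; region is (left, top, width, height).
    A pixel mismatches when its values differ by more than `tolerance`; the
    region matches when the mismatching fraction is at most `max_mismatch`.
    """
    left, top, width, height = region
    if width <= 0 or height <= 0:
        raise ValueError("region must have positive width and height")
    mismatched = 0
    for y in range(top, top + height):
        for x in range(left, left + width):
            if abs(expected[y][x] - actual[y][x]) > tolerance:
                mismatched += 1
    return mismatched / (width * height) <= max_mismatch


# Hypothetical 4x4 screenshots that differ only in the top-left pixel.
expected = [[0] * 4 for _ in range(4)]
actual = [row[:] for row in expected]
actual[0][0] = 255

print(region_matches(expected, actual, (1, 1, 2, 2)))  # region excludes the changed pixel
print(region_matches(expected, actual, (0, 0, 2, 2)))  # region includes the changed pixel
```

Restricting the comparison to a region is what lets a test ignore parts of the screen, such as a clock or desktop background, that legitimately change between runs.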

Mjtsai

3d

NSCache and LRUCache

  • NSCache's eviction strategy is not defined and it's not LRU, which can impact performance when handling memory caching.
  • Many developers opt to create their own LRU cache implementation like LRUCache using a Swift Dictionary with a linked list to control memory consumption effectively.
  • Custom LRU cache implementations need to manage deallocation carefully: automatically releasing a long chain of linked list nodes can recurse deeply enough to cause a stack overflow.
  • To ensure objects are retained in the cache even when the app is backgrounded, developers can implement the NSDiscardableContent protocol within the objects stored in the cache.
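
The dictionary-plus-linked-list design mentioned above can be sketched briefly. The code below is a Python illustration rather than Swift and is not the LRUCache library's implementation; the class layout, method names, and sentinel-node approach are assumptions. It also does not reproduce the Swift-specific deallocation problem.

```python
class _Node:
    """Doubly linked list node holding one cache entry."""
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None


class LRUCache:
    """Dictionary for O(1) lookup plus a doubly linked list for recency order."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._nodes = {}
        # Sentinels: head.next is most recent, tail.prev is least recent.
        self._head, self._tail = _Node(), _Node()
        self._head.next, self._tail.prev = self._tail, self._head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.prev, node.next = self._head, self._head.next
        self._head.next.prev = node
        self._head.next = node

    def get(self, key, default=None):
        node = self._nodes.get(key)
        if node is None:
            return default
        self._unlink(node)       # a hit makes the entry most recently used
        self._push_front(node)
        return node.value

    def set(self, key, value):
        node = self._nodes.get(key)
        if node is not None:
            node.value = value
            self._unlink(node)
        else:
            if len(self._nodes) >= self.capacity:
                oldest = self._tail.prev  # evict the least recently used entry
                self._unlink(oldest)
                del self._nodes[oldest.key]
            node = _Node(key, value)
            self._nodes[key] = node
        self._push_front(node)


cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")      # "a" becomes most recently used
cache.set("c", 3)   # evicts "b"
print(cache.get("b"), cache.get("a"), cache.get("c"))
```

Unlike NSCache, the eviction order here is fully defined: the entry touched least recently is always the one removed.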

Marktechpost

3d

Ming-Lite-Uni: An Open-Source AI Framework Designed to Unify Text and Vision through an Autoregressive Multimodal Structure

  • Multimodal AI systems aim to integrate text and vision for seamless human-AI communication in various tasks like image captioning and style transfers.
  • Challenges arise with separate models handling different modalities, leading to incoherence and scalability issues.
  • Research focuses on unifying models for accurate interpretation and generation in a combined text and vision context.
  • Inclusion AI (Ant Group) introduced Ming-Lite-Uni, an open-source framework uniting text and vision via an autoregressive multimodal structure.
  • Ming-Lite-Uni uses multi-scale learnable tokens and alignment strategies for coherence in image and text processing.
  • The model compresses visual inputs into token sequences across multiple scales for detailed image reconstruction.
  • It maintains a frozen language model and fine-tunes the image generator, leading to more efficient updates and scaling.
  • The system excelled in tasks like text-to-image generation, style transfer, and image editing with contextual fluency and high fidelity.
  • Training on over 2.25 billion samples from diverse datasets enhanced the model's visual output and aesthetic assessment accuracy.
  • Ming-Lite-Uni's approach bridges language understanding and image generation, offering a significant advancement in multimodal AI systems.

Hackernoon

4d

The HackerNoon Newsletter: Speechify, ElevenLabs, Hume: Which AI Voice Can Actually Feel Something? (5/8/2025)

  • This edition of The HackerNoon Newsletter notes that on this day in 1945 Nazi Germany surrendered, ending World War II in Europe, and features tech stories including Build a Smarter Store and Speechify, ElevenLabs, Hume: Which AI Voice Can Actually Feel Something?
  • Tired of Paying for Emails? covers an open-source alternative built to discover and verify professional emails, and the pushback its builders faced.
  • Build a Smarter Store covers creating a real-time knowledge graph for product insights, using LLM extraction of taxonomy and complementary taxonomy.
  • AI voices such as ElevenLabs and Hume are compared on generating emotionally nuanced speech; the newsletter also covers AI giants battling for control over web browsers.

Marktechpost

4d

NVIDIA Open-Sources Open Code Reasoning Models (32B, 14B, 7B)

  • NVIDIA open-sources its Open Code Reasoning (OCR) model suite, including 32B, 14B, and 7B variants under the Apache 2.0 license.
  • The OCR models have shown superior benchmark results, outperforming OpenAI's models on code reasoning tasks like debugging and code generation.
  • NVIDIA attributes the performance boost to its custom 'OCR dataset' and Nemotron architecture, offering a balance between scale and performance.
  • The models are compatible with popular inference frameworks, providing a cost-effective and community-friendly solution for code intelligence development.

Marktechpost

4d

Hugging Face Releases nanoVLM: A Pure PyTorch Library to Train a Vision-Language Model from Scratch in 750 Lines of Code

  • Hugging Face has released nanoVLM, a PyTorch-based framework for training vision-language models from scratch in just 750 lines of code.
  • nanoVLM is a compact and educational tool that offers a minimalist approach to vision-language modeling, emphasizing readability and modularity.
  • The framework combines a visual encoder, a language decoder, and a modality projection mechanism to bridge images and text, achieving competitive performance with efficient design.
  • nanoVLM is designed for educational use, reproducibility studies, and rapid prototyping, highlighting transparency and modularity for easy extension and experimentation.

VentureBeat

5d

Mistral comes out swinging for enterprise AI customers with new Le Chat Enterprise, Medium 3 model

  • French AI startup Mistral has introduced Le Chat Enterprise, an AI assistant platform tailored for enterprises and powered by its new Medium 3 model, offering enhanced performance at a lower cost compared to larger models.
  • Le Chat Enterprise aims to provide a privacy-first environment for enterprise-scale productivity, offering seamless integration into existing workflows, strict data governance, and full customization.
  • Mistral's Medium 3 model is optimized for enterprise use, delivering high performance in software development tasks and surpassing benchmarks set by other models. It offers competitive performance across different languages and modalities.
  • Mistral Medium 3 is already used by organizations in various sectors for domain-specific workflows and customer-facing solutions. It is available through Mistral's La Plateforme API and Amazon SageMaker, with support for other platforms planned.
