techminis

A naukri.com initiative


Open Source News

Marktechpost · 4w

Allen Institute for AI Released olmOCR: A High-Performance Open Source Toolkit Designed to Convert PDFs and Document Images into Clean and Structured Plain Text

  • The Allen Institute for AI introduced olmOCR, an open-source Python toolkit for converting PDFs into structured text with logical reading order.
  • Traditional OCR tools struggle to extract coherent text from PDFs because the format prioritizes visual layout over logical reading flow.
  • olmOCR leverages a 7-billion-parameter VLM, fine-tuned on 260,000 PDF pages, for accurate extraction by integrating text and visual data.
  • Using document anchoring, olmOCR aligns the PDF's embedded text metadata with visual page elements to improve extraction accuracy and readability (a conceptual sketch of the idea follows this list).
  • The toolkit processes one million PDF pages for $190, making it significantly more cost-efficient than systems such as GPT-4o.
  • olmOCR surpasses competitors in accuracy and efficiency, achieving an alignment score of 0.875 and excelling in structured content recognition.
  • Through human evaluation, olmOCR received the highest ELO rating among OCR methods and improved language model training by 1.3% in benchmark tasks.
  • The system is compatible with inference frameworks like vLLM and SGLang, facilitating deployment across hardware setups.
  • olmOCR's innovation lies in combining textual and image-based analysis for improved extraction accuracy and structured data recognition.
  • The toolkit's cost-effectiveness, high accuracy, and compatibility make it a valuable resource for large-scale document processing and language model training.
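
A conceptual sketch of the document-anchoring idea described above, assuming PyMuPDF for PDF parsing; this illustrates the technique and is not olmOCR's actual pipeline code:

    # Pair a rendered page image with the PDF's own text layer and coordinates,
    # so a VLM sees both the layout and the raw content (olmOCR fine-tunes a
    # 7B VLM to consume prompts built along these lines).
    import base64
    import fitz  # PyMuPDF

    def build_anchored_prompt(pdf_path: str, page_no: int = 0):
        page = fitz.open(pdf_path)[page_no]
        # Born-digital text with bounding boxes serves as the "anchor" metadata.
        words = page.get_text("words")  # tuples: (x0, y0, x1, y1, word, ...)
        anchor = "\n".join(f"[{x0:.0f},{y0:.0f}] {w}" for x0, y0, _, _, w, *rest in words[:200])
        # Rendered page image for the vision side of the model.
        image_b64 = base64.b64encode(page.get_pixmap(dpi=150).tobytes("png")).decode()
        prompt = (
            "Below is the text layer of a PDF page with coordinates. Using it "
            "together with the page image, return clean plain text in natural "
            "reading order.\n" + anchor
        )
        return prompt, image_b64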

Read Full Article

23 Likes

TechCrunch · 4w

Continue wants to help developers create and share custom AI coding assistants

  • Continue, a startup founded in June 2023 by CEO Ty Dunn and CTO Nate Sesti, aims to help developers create custom AI coding assistants that seamlessly integrate with different models and development environments.
  • Continue is an open-source AI code assistant that lets teams personalize autocomplete suggestions and chat features inside their coding environments.
  • The launch of Continue's v1.0 product, backed by $3 million in seed funding, arrives amidst a surge in AI coding assistants like GitHub Copilot, Google's Gemini Code Assist, and others.
  • Continue positions itself as a platform where developers can pull in context from platforms like Jira or Confluence to enhance their coding experiences.
  • The startup also introduces a new hub similar to Docker Hub or GitHub, allowing developers to create and share custom AI code assistants and blocks from partners like Mistral, Anthropic, and Ollama.
  • By fostering a 'culture of contribution,' Continue encourages developers to create and share customizations, contrasting with the closed-source nature of some AI assistant providers.
  • Continue focuses on data control, allowing companies to retain ownership of their data and decide how much they want to share, unlike 'one-size-fits-all' platforms.
  • The startup targets developers of all sizes, offering a free solo tier and paid options for organizations needing enhanced administration, governance, and security features.
  • Continue's funding efforts have seen support from developer-focused VC firm Heavybit, enabling the startup to expand its team and advance its open-source distribution model.

Read Full Article

16 Likes

Marktechpost · 4w

DeepSeek AI Releases DeepGEMM: An FP8 GEMM Library that Supports both Dense and MoE GEMMs Powering V3/R1 Training and Inference

  • DeepSeek AI has released DeepGEMM, an FP8 GEMM library for efficient matrix multiplications in deep learning and high-performance computing.
  • DeepGEMM supports both standard and Mixture-of-Experts (MoE) grouped GEMMs, leveraging NVIDIA Hopper tensor cores.
  • The library uses fine-grained scaling and a two-level accumulation strategy for accurate FP8 arithmetic without compromising performance (see the per-block scaling sketch after this list).
  • DeepGEMM offers clear efficiency improvements with speedups of up to 2.7x for normal GEMMs and 1.1x to 1.2x for grouped GEMMs.
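
A toy sketch of the fine-grained scaling idea referenced above: quantize each 128-column block with its own scale and compare the round-trip error against a single per-tensor scale. This is plain PyTorch (float8 dtypes need torch >= 2.1) and only mimics the numerics; it is not DeepGEMM's CUDA implementation:

    import torch

    FP8_MAX = 448.0  # largest finite value in the e4m3 format

    def fp8_roundtrip(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
        # Scale into the FP8 range, cast to e4m3, then dequantize back to fp32.
        q = (x / scale).clamp(-FP8_MAX, FP8_MAX).to(torch.float8_e4m3fn)
        return q.to(torch.float32) * scale

    x = torch.randn(64, 512) * torch.logspace(-2, 2, 512)  # columns of very different magnitude

    # One scale for the whole tensor (coarse) vs. one scale per 128-column block (fine).
    coarse = fp8_roundtrip(x, x.abs().max() / FP8_MAX)
    fine = torch.cat(
        [fp8_roundtrip(blk, blk.abs().max() / FP8_MAX) for blk in x.split(128, dim=1)],
        dim=1,
    )

    print("per-tensor error:", (coarse - x).abs().mean().item())
    print("per-block error: ", (fine - x).abs().mean().item())  # noticeably smaller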

Read Full Article

9 Likes

Medium · 4w

My Open-Source journey with Sugar Labs

  • Harshit Verma, an undergraduate student, shares his open-source journey with Sugar Labs, an organization dedicated to educational software.
  • He started his journey in June 2024, exploring projects to contribute and came across Sugar Labs, particularly Music Blocks, which caught his interest.
  • Harshit initially struggled with setting up the development environment but raised his first pull request in July 2024, getting a first-hand look at the collaboration and iterative improvement that characterize open-source work.
  • With the goal of creating innovative learning experiences, Harshit plans to continue contributing to Sugar Labs and grow alongside the organization.

Read Full Article

6 Likes

Siliconangle · 1M

Snyk launches Secure Developer Program to strengthen open-source security

  • Cybersecurity company Snyk has launched its Secure Developer Program to equip open-source software maintainers with robust security tooling.
  • Qualifying open-source projects will receive Snyk's enterprise-grade security tools and API access at no cost, along with hands-on support from Snyk's developer relations team.
  • The program aims to fix vulnerabilities in open-source software and enhance global cybersecurity by enabling contributors to create secure code and software.
  • Applications for the program are open to open-source projects with at least 10,000 GitHub stars and no corporate backing.

Read Full Article

18 Likes

Kaspersky · 1M

Malicious code in fake GitHub repositories | Kaspersky official blog

  • Researchers at Kaspersky have uncovered a malicious campaign called GitVenom targeting GitHub users.
  • In this campaign, unknown actors created over 200 repositories containing fake projects with malicious code.
  • The repositories appeared legitimate, with well-designed README.md files and a large number of commits creating the illusion of authenticity.
  • The malicious components found in these repositories include a Node.js stealer, AsyncRAT Trojan, Quasar backdoor, and a clipper.

Read Full Article

26 Likes

Marktechpost · 1M

DeepSeek AI Releases DeepEP: An Open-Source EP Communication Library for MoE Model Training and Inference

  • DeepSeek AI has released DeepEP, an open-source expert-parallel (EP) communication library for MoE model training and inference.
  • DeepEP tackles the inter-GPU communication bottleneck in MoE models, providing optimized all-to-all GPU kernels for efficient data exchange during training and inference (a conceptual all-to-all sketch follows this list).
  • The library includes normal kernels for high throughput and low-latency kernels for responsiveness in real-time applications.
  • DeepEP's performance metrics show significant improvements in communication throughput, latency, and memory usage, leading to faster response times and improved efficiency in training and inference.
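
A minimal, framework-level sketch of the all-to-all dispatch pattern mentioned above, using torch.distributed rather than DeepEP's own kernels (assumes an initialized process group and one expert per rank):

    import torch
    import torch.distributed as dist

    def dispatch_tokens(tokens: torch.Tensor, expert_ids: torch.Tensor, world_size: int):
        """Route each token to the rank hosting its expert via all-to-all."""
        order = torch.argsort(expert_ids)                 # group tokens by destination rank
        send = tokens[order]
        send_counts = torch.bincount(expert_ids, minlength=world_size)

        # First exchange counts so every rank knows how many tokens it will receive.
        recv_counts = torch.empty_like(send_counts)
        dist.all_to_all_single(recv_counts, send_counts)

        # Then exchange the token payloads with uneven splits.
        recv = torch.empty(int(recv_counts.sum()), tokens.shape[1], dtype=tokens.dtype)
        dist.all_to_all_single(
            recv, send,
            output_split_sizes=recv_counts.tolist(),
            input_split_sizes=send_counts.tolist(),
        )
        return recv  # tokens now sit on the GPU that owns their expert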

Read Full Article

16 Likes

Robotsblog · 1M

Open-Source Robot pib wins German Design Award 2025

  • The humanoid robot pib has won the German Design Award 2025 for its innovative design and technological sophistication.
  • pib is an open-source robot that aims to make robotics and AI more accessible to everyone.
  • The German Design Award recognized pib for its excellent product design in the category of 'AI in Product Design Processes.'
  • pib serves as a platform for experimenting with 3D printing, robotics, and artificial intelligence, and is being used as a learning platform in schools and media centers.

Read Full Article

26 Likes

Securelist · 1M

The GitVenom campaign: cryptocurrency theft using GitHub

  • The GitVenom campaign utilizes fake projects with malicious code on GitHub to target users, reflecting a rising trend of using open-source code as a lure for attacks.
  • Threat actors created hundreds of repositories with fake projects like Instagram automation tools and hacking utilities designed to appear legitimate.
  • Repositories contained well-crafted README.md files and artificially inflated commit counts to deceive potential victims.
  • The malicious code was written in several languages, including Python, JavaScript, C, C++, and C#, and performed actions different from what the fake projects described.
  • The attackers used encrypted scripts, malicious functions, and batch scripts to implant and execute the malicious code within the projects.
  • The malicious payloads aimed to download further components from an attacker-controlled repository, including a Node.js stealer, AsyncRAT implant, Quasar backdoor, and a clipboard hijacker.
  • The campaign has targeted potential victims worldwide over the past few years, with notable activity in Russia, Brazil, and Turkey.
  • It is critical for developers to cautiously assess and verify third-party code from platforms like GitHub to prevent incorporating malicious code into their projects.
  • The campaign's impact has been substantial, with infection attempts continuing globally, emphasizing the need for heightened vigilance in handling open-source code.
  • Reference hashes for infected repository archives are provided as a resource for identification and mitigation of the GitVenom threat.

Read Full Article

4 Likes

Silicon · 1M

AI Start-Up DeepSeek To Open-Source AGI Code

  • Chinese AI start-up DeepSeek is open-sourcing five of its code repositories related to artificial general intelligence (AGI).
  • DeepSeek's goal is to achieve AGI, meaning an AI that can perform general tasks at least as well as a human.
  • The company aims to share its progress with full transparency and considers every line shared to be collective momentum for the AI community.
  • DeepSeek's technology has gained attention, with various companies and organizations integrating it into their offerings.

Read Full Article

18 Likes

Marktechpost · 1M

Building a Legal AI Chatbot: A Step-by-Step Guide Using bigscience/T0pp LLM, Open-Source NLP Models, Streamlit, PyTorch, and Hugging Face Transformers

  • This tutorial provides a step-by-step guide to building a Legal AI Chatbot using bigscience/T0pp LLM, Hugging Face Transformers, and PyTorch.
  • The tutorial covers setting up the model, preprocessing legal text, extracting legal entities, building a FAISS-based document retrieval system, and generating responses to legal queries (a condensed retrieval-plus-generation sketch follows this list).
  • By integrating open-source resources, this project aims to make legal assistance more accessible and automated.
  • The code examples and Colab notebook are provided in the tutorial for reference.
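
A condensed sketch of the retrieval-plus-generation flow the tutorial describes. The model name bigscience/T0pp comes from the article; the sentence-transformers embedder and the sample documents are assumptions for illustration (T0pp is roughly 11B parameters, so a smaller checkpoint may be more practical locally):

    import faiss
    from sentence_transformers import SentenceTransformer
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    docs = [
        "A valid contract requires offer, acceptance, and consideration.",
        "A tort is a civil wrong that causes harm or loss to another person.",
    ]

    # Embed the documents and index them with FAISS for retrieval.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_vecs = embedder.encode(docs).astype("float32")
    index = faiss.IndexFlatL2(doc_vecs.shape[1])
    index.add(doc_vecs)

    tok = AutoTokenizer.from_pretrained("bigscience/T0pp")
    model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0pp")

    def answer(question: str) -> str:
        q_vec = embedder.encode([question]).astype("float32")
        _, idx = index.search(q_vec, 1)                  # nearest legal passage
        context = docs[int(idx[0][0])]
        prompt = f"Answer the legal question using the context.\nContext: {context}\nQuestion: {question}"
        out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=64)
        return tok.decode(out[0], skip_special_tokens=True)

    print(answer("What does a contract need to be valid?"))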

Read Full Article

3 Likes

Marktechpost · 1M

Moonshot AI and UCLA Researchers Release Moonlight: A 3B/16B-Parameter Mixture-of-Expert (MoE) Model Trained with 5.7T Tokens Using Muon Optimizer

  • Moonlight is a Mixture-of-Experts (MoE) model developed by Moonshot AI and UCLA, trained with the Muon optimizer to handle the challenges of large-scale language model training.
  • Muon addresses issues like vanishing/exploding gradients, inconsistent updates, and resource demands in training models with billions of parameters and trillions of tokens.
  • The Muon optimizer applies matrix orthogonalization via Newton-Schulz iterations to keep gradient updates uniform across the model (a small sketch of the iteration follows this list).
  • Technical adjustments to Muon include integrating weight decay and scaling updates to align with AdamW's performance.
  • Muon's distributed implementation reduces memory overhead and communication costs in large-scale training environments.
  • Empirical evaluations show that Moonlight trained with Muon outperformed other models in language understanding and code generation tasks.
  • Scaling law experiments demonstrate Muon's ability to match AdamW performance with reduced computational cost.
  • Moonlight's training with Muon leads to a diverse range of singular values in weight matrices, aiding generalization across tasks.
  • The project demonstrates improvements in training efficiency and stability, providing a viable alternative to traditional optimization methods.
  • The open-sourcing of Muon implementation is expected to encourage further research into scalable optimization techniques for large language models.
  • Transitioning from AdamW to Muon does not require extensive tuning, simplifying the integration process for researchers.
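
A small sketch of the Newton-Schulz orthogonalization step mentioned above. Muon's actual implementation uses a tuned quintic iteration in bfloat16; the classic cubic iteration below illustrates the same idea of pushing a gradient matrix toward its nearest semi-orthogonal matrix:

    import torch

    def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 15) -> torch.Tensor:
        # Normalize so the spectral norm is <= 1 (the Frobenius norm bounds it).
        x = g / (g.norm() + 1e-7)
        eye = torch.eye(g.shape[1], dtype=g.dtype)
        for _ in range(steps):
            # X <- 0.5 * X (3I - X^T X) converges to the orthogonal factor
            # of the polar decomposition of g.
            x = 0.5 * x @ (3.0 * eye - x.T @ x)
        return x

    g = torch.randn(4, 3)          # stands in for a 2-D gradient/momentum matrix
    q = newton_schulz_orthogonalize(g)
    print((q.T @ q - torch.eye(3)).abs().max())  # close to 0: columns are near-orthonormal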

Read Full Article

15 Likes

Marktechpost · 1M

Stanford Researchers Introduce OctoTools: A Training-Free Open-Source Agentic AI Framework Designed to Tackle Complex Reasoning Across Diverse Domains

  • Stanford researchers introduced OctoTools, an agentic AI framework enhancing reasoning capabilities by facilitating dynamic, structured external tool usage.
  • OctoTools overcomes limitations of existing frameworks by standardizing AI interactions with external tools using modular 'tool cards.'
  • The framework runs in planner, executor, and verifier phases that handle tool selection, command execution, and result verification, respectively (an illustrative sketch of this loop follows the list).
  • OctoTools outperformed other frameworks, achieving an average 9.3% accuracy improvement over GPT-4o across diverse tasks.
  • It demonstrated significant enhancements in vision, math, medical, and scientific domains, with accuracy boosts ranging from 7.4% to 22.5%.
  • The task-specific toolset optimization algorithm improved efficiency, reducing computational costs and enhancing performance.
  • OctoTools supports structured problem-solving and multi-step reasoning without requiring extensive model retraining.
  • The framework's adaptability to new domains, cost-effectiveness, and scalability make it an effective solution for AI-driven decision-making.
  • Researchers extensively evaluated OctoTools across 16 benchmarks, showcasing its superior performance in various applications.
  • For more details, refer to the research paper and GitHub page for OctoTools.
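
An illustrative-only sketch of the planner/executor/verifier loop over modular tool cards that the summary describes; the class names and fields here are assumptions, not OctoTools' actual API:

    from dataclasses import dataclass
    from typing import Callable, Dict

    @dataclass
    class ToolCard:
        name: str
        description: str              # tells the planner when the tool applies
        run: Callable[[str], str]     # standardized call interface

    TOOLS: Dict[str, ToolCard] = {
        "calculator": ToolCard(
            "calculator", "evaluate arithmetic expressions",
            run=lambda expr: str(eval(expr)),
        ),
    }

    def plan(query: str) -> str:
        # Planner phase: choose a tool from the card descriptions
        # (OctoTools delegates this decision to an LLM).
        return "calculator" if any(ch.isdigit() for ch in query) else "none"

    def solve(query: str) -> str:
        tool = plan(query)
        if tool == "none":
            return "no tool needed"
        result = TOOLS[tool].run(query)          # executor phase: run the tool
        return result if result.strip() else "verification failed"  # verifier phase

    print(solve("12 * (3 + 4)"))  # -> 84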

Read Full Article

17 Likes

Marktechpost · 1M

Google DeepMind Research Releases SigLIP2: A Family of New Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

  • Google DeepMind Research has introduced SigLIP2, a new family of multilingual vision-language encoders focusing on improved semantic understanding, localization, and dense features.
  • Traditional vision-language models have limitations in fine-grained localization and dense feature extraction, impacting tasks requiring precise spatial reasoning.
  • SigLIP2 blends captioning-based pretraining with self-supervised methods to enhance semantic representation and detailed feature capturing.
  • The model employs a mix of multilingual data, de-biasing techniques, and a sigmoid loss for balanced global and local feature learning (a minimal sketch of the sigmoid loss follows this list).
  • Technical aspects include a decoder-based loss, MAP head for feature pooling, and NaFlex variant for preserving native aspect ratios.
  • Experimental results showcase improvements in zero-shot classification, multilingual tasks, and dense prediction tasks like segmentation and depth estimation.
  • SigLIP2 shows reduced biases in tasks like referring expression comprehension and open-vocabulary detection, emphasizing fairness and robust performance.
  • The model's ability to handle various resolutions and configurations while maintaining performance highlights its potential for research and practical applications.
  • By incorporating multilingual support and de-biasing measures, SigLIP2 demonstrates a balanced approach addressing technical challenges and ethical considerations.
  • The release of SigLIP2 sets a promising benchmark for vision-language models, offering versatility, reliability, and inclusivity in its approach.
  • SigLIP2's compatibility with previous versions and emphasis on fairness make it a significant advancement in vision-language research and application.
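
A minimal sketch of the pairwise sigmoid loss referenced above, which SigLIP-style models use instead of a batch-wide softmax contrastive loss; the temperature and bias values below are illustrative constants (in training they are learned):

    import torch
    import torch.nn.functional as F

    def siglip_sigmoid_loss(img_emb, txt_emb, t=10.0, b=-10.0):
        """img_emb, txt_emb: (N, D) L2-normalized embeddings of N matched pairs."""
        logits = t * img_emb @ txt_emb.T + b                   # (N, N) pairwise scores
        labels = 2.0 * torch.eye(logits.shape[0]) - 1.0        # +1 on the diagonal, -1 elsewhere
        # Each image-text pair is scored independently with a sigmoid,
        # so no normalization over the whole batch is required.
        return -F.logsigmoid(labels * logits).sum(dim=-1).mean()

    img = F.normalize(torch.randn(8, 64), dim=-1)
    txt = F.normalize(torch.randn(8, 64), dim=-1)
    print(siglip_sigmoid_loss(img, txt))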

Read Full Article

12 Likes

Digitaltrends · 1M

DeepSeek invites users behind the curtain of its open source AI code

  • Chinese startup DeepSeek plans to open-source several of its internal AI code repositories to the public.
  • This move aims to provide developers and researchers with a deeper understanding of DeepSeek's code.
  • The transparency initiative may also help dispel doubts about the technology and quell lingering security concerns.
  • DeepSeek's strategy aligns with the growing trend of open-source development in the AI industry.

Read Full Article

27 Likes
