techminis

A naukri.com initiative


ML News

Silicon · 7d · 132 reads

Silicon In Focus Podcast: Tech in 2025

  • Join the Silicon In Focus Podcast: Tech in 2025 to explore the technologies shaping our future.
  • Steven Webb, UK Chief Technology & Innovation Officer at Capgemini UK, discusses the transformative impact of AI and quantum computing.
  • The podcast highlights how AI will redefine work, sustainability, and customer experiences.
  • Get insights into the challenges and opportunities ahead in the tech landscape.

7 Likes

Marktechpost · 7d · 407 reads

Meet Moxin LLM 7B: A Fully Open-Source Language Model Developed in Accordance with the Model Openness Framework (MOF)

  • Researchers from various universities and organizations have released Moxin LLM 7B, a fully open-source language model.
  • It was developed under the Model Openness Framework (MOF) and provides comprehensive access to its code, datasets, and checkpoints.
  • Moxin LLM 7B offers a robust option for NLP and coding applications, with features like grouped-query attention and sliding window attention.
  • The model's strong performance in zero-shot and few-shot evaluations demonstrates its capability for complex reasoning and multitask challenges.
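The two attention features named above can be illustrated with a toy sketch; this is a generic illustration of the techniques, not Moxin's implementation, and the function names are hypothetical:

```python
def sliding_window_mask(seq_len: int, window: int):
    """Causal mask where token i attends only to the previous `window`
    tokens (itself included): 1 = attend, 0 = masked."""
    return [[1 if 0 <= i - j < window else 0 for j in range(seq_len)]
            for i in range(seq_len)]

def gqa_kv_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Grouped-query attention: consecutive query heads share one KV head,
    shrinking the KV cache by a factor of n_q_heads / n_kv_heads."""
    return q_head // (n_q_heads // n_kv_heads)
```

With a window of 3, token 4 attends only to tokens 2-4; with 8 query heads and 2 KV heads, query heads 0-3 all read the first KV head.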

24 Likes

Netflixtechblog · 7d · 322 reads

Introducing Configurable Metaflow

  • Metaflow has introduced Configs, a new construct that complements the existing artifacts and Parameters by letting you configure all aspects of a flow, decorators in particular, before any run starts.
  • A flow can be made configurable through configuration files, so variants can be defined without changing the code. Configs can be used widely in flow code and can also set defaults for Parameters.
  • Both Parameters and Configs are stored as artifacts by Metaflow, but they differ in when they are persisted: Parameters are resolved and persisted at the start of a run, while Configs are resolved and persisted when the flow is deployed.
  • Configurations are read from a pleasantly human-readable configuration file and can specify, for example, a triggering @schedule or @resources requirements.
  • Metaflow has always made it easy to sweep over parameter grids using foreaches, but altering the flow itself, for instance changing @resources or @pypi/@conda dependencies for every experiment, has not been easily possible. Configuration managers like Hydra orchestrate experiments over multiple configurations or sweep over parameter spaces.
  • Configs integrate with Metaboost, an internal Netflix CLI tool that helps ML practitioners manage, develop, and execute their cross-platform projects; it provides a single interface to three internal Netflix platforms that manage ETL/Workflows (Maestro), Machine Learning Pipelines (Metaflow), and Data Warehouse Tables (Kragle).
  • Metaboost's configuration system combines Git-based parameters, global configurations, and arbitrarily bound configuration files for use during execution against internal Netflix platforms.
  • Integrating this configuration system with Configs is simple: users add a mix-in class to their FlowSpec and reference the configuration values in steps or decorators.
  • Configs allow for reproducible, consistent, low-boilerplate, easily configurable experiments and robust production deployments.
  • Configs become dictionary artifacts that are versioned and persisted automatically, so the Configs of any past run can be accessed through the Client API.
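The deploy-time versus run-time distinction described above can be sketched in plain Python. This is a conceptual stand-in, not Metaflow's actual API; the names `load_config` and `train_step` are hypothetical:

```python
import json
from types import SimpleNamespace

def load_config(text: str) -> SimpleNamespace:
    """Resolve a configuration once, at 'deploy' time (hypothetical helper)."""
    return SimpleNamespace(**json.loads(text))

# Deploy time: the config is resolved before any run starts, so even
# decorator-level settings such as resource requests can be derived from it.
config = load_config('{"cpu": 4, "alpha": 0.1}')
resources = {"cpu": config.cpu}  # what an @resources-style decorator would see

# Run time: every step of every subsequent run reads the same resolved values.
def train_step() -> str:
    return f"alpha={config.alpha}, cpus={resources['cpu']}"
```

The point of the pattern is that changing the config file produces a new flow variant without touching the flow code.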

18 Likes

Marktechpost · 7d · 182 reads

How AI Models Learn to Solve Problems That Humans Can’t

  • Researchers have developed the Easy-to-Hard Generalization (E2H) methodology to tackle alignment issues in complex tasks without relying on human feedback.
  • The methodology involves Process-Supervised Reward Models (PRMs), Easy-to-Hard generalization, and Iterative Refinement.
  • The E2H methodology enables AI models to move from heavy reliance on human feedback to requiring far fewer human annotations.
  • The method demonstrates significant improvements in performance and reduces the need for human-labeled data on complex tasks.

10 Likes

Marktechpost · 7d · 270 reads

Scaling Language Model Evaluation: From Thousands to Millions of Tokens with BABILong

  • BABILong is a new benchmark for evaluating language models' reasoning over long documents. Its flexible design allows testing on sequences of up to 50 million tokens, making it uniquely suited to evaluating next-generation models.
  • Testing across various question-answering tasks shows that most current LLMs effectively use only 10-20% of their advertised context window. Models like GPT-4 and Llama-3.1-70b remain effective up to 16K tokens, but most models struggle beyond 4K tokens.
  • BABILong encompasses 20 distinct reasoning tasks and uses books from the PG19 corpus as source material. Notably, this synthetic approach makes the benchmark immune to training-data contamination. Even advanced models such as GPT-4 and Gemini 1.5 Pro effectively use only 5-25% of their input context.
  • Earlier benchmarks such as LRA, L-Eval, and LongBench handle sequences of roughly 16,000-60,000 tokens, while recent ones like InfinityBench and ChapterBreak reach up to 636,000 tokens; LongAlign and LongICLBench similarly focus on in-context learning and instruction following. BABILong instead allows unlimited scaling of context length, enabling the evaluation of models with context windows of millions of tokens.
  • Fine-tuning experiments proved particularly revealing: even relatively small recurrent memory models like RMT and ARMT (137M parameters) can handle BABILong tasks effectively, with ARMT processing sequences of up to 50 million tokens with consistent performance.
  • Advances in Large Language Models (LLMs) and neural architectures have significantly improved long-context processing, with profound implications for many applications.
  • Among open LLMs, Qwen-2.5 models lead on the benchmark. The evaluation also explored alternative approaches such as Retrieval-Augmented Generation (RAG) and fine-tuned models; RAG showed only limited success.
  • Most current evaluation benchmarks remain limited to about 40,000 tokens, creating a significant gap between model capabilities and evaluation methods.
  • The BABILong benchmark is thus a significant advancement in evaluating language models' long-context capabilities through its unique combination of scalability and diverse reasoning tasks.
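The construction style used by such benchmarks, scattering a few task-relevant facts through arbitrarily long distractor text, can be sketched generically. This is illustrative only, not BABILong's actual generator, and the function name is hypothetical:

```python
import random

def make_long_context_task(facts, filler_sentences, target_len, seed=0):
    """Hide 'needle' facts at random positions among distractor sentences;
    target_len controls how much filler pads the context."""
    rng = random.Random(seed)
    body = [rng.choice(filler_sentences) for _ in range(target_len)]
    for fact in facts:
        body.insert(rng.randrange(len(body) + 1), fact)
    return " ".join(body)

facts = ["Mary moved to the kitchen.", "Mary picked up the apple."]
filler = ["The weather was unremarkable.", "A train passed in the distance."]
context = make_long_context_task(facts, filler, target_len=50)
# A model is then asked, e.g., "Where is the apple?" over `context`.
```

Because both facts and filler are generated, the answer cannot be memorized from training data, and `target_len` scales the context to any length.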

16 Likes

Medium · 7d · 394 reads

Exploring Real-World Use Cases of Machine Learning Development Across Industries

  • Machine learning has revolutionized healthcare, enabling early disease detection and personalized treatment plans.
  • Retailers use machine learning to analyze customer behavior and improve personalization.
  • Manufacturers employ machine learning to streamline production processes and reduce downtime.
  • The financial sector benefits from AI-driven solutions for fraud detection and customer relationship management.

23 Likes

Arxiv · 7d · 83 reads

Heterogeneous Multi-Agent Reinforcement Learning for Distributed Channel Access in WLANs

  • This paper explores the application of multi-agent reinforcement learning (MARL) to address distributed channel access in wireless local area networks (WLANs).
  • The study focuses on the practical scenario where heterogeneous agents adopt either value-based or policy-based reinforcement learning algorithms to train the model.
  • The researchers propose a heterogeneous MARL training framework called QPMIX, which enables collaboration among heterogeneous agents through centralized training and distributed execution.
  • Simulation results demonstrate that the QPMIX algorithm improves network throughput, mean delay, delay jitter, and collision rates compared to conventional carrier-sense multiple access with collision avoidance in saturated traffic scenarios.

5 Likes

Arxiv · 7d · 78 reads

A Survey on Inference Optimization Techniques for Mixture of Experts Models

  • A new survey analyzes inference optimization techniques for Mixture of Experts (MoE) models.
  • The survey categorizes optimization approaches into model-level, system-level, and hardware-level optimizations.
  • Model-level optimizations include architectural innovations, compression techniques, and algorithm improvements.
  • System-level optimizations investigate distributed computing approaches, load balancing mechanisms, and efficient scheduling algorithms.
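Sparse top-k expert routing, the mechanism behind many of the model-level savings such surveys analyze, can be sketched as follows. This is a generic illustration and the names are hypothetical, not drawn from any specific MoE system:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights.
    Only these k experts run per token, which is where MoE inference
    saves compute relative to a dense model of equal parameter count."""
    probs = softmax(gate_logits)
    chosen = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    total = sum(probs[i] for i in chosen)
    return {i: probs[i] / total for i in chosen}
```

System-level concerns like load balancing follow directly from this: if the gate keeps choosing the same experts, those devices become hotspots.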

4 Likes

Arxiv · 7d · 390 reads

Towards Precise Prediction Uncertainty in GNNs: Refining GNNs with Topology-grouping Strategy

  • Recent advancements in graph neural networks (GNNs) have highlighted the critical need for calibrating model predictions, with neighborhood prediction similarity recognized as a pivotal component.
  • Existing approaches incorporate neighborhood similarity into node-wise temperature scaling, but the underlying assumption that similar neighborhoods imply similar calibration behavior does not hold universally and can lead to sub-optimal calibration.
  • The new approach, Simi-Mailbox, categorizes nodes by both neighborhood similarity and their own confidence, allowing fine-grained calibration via group-specific temperature scaling.
  • Extensive experiments demonstrate the effectiveness of Simi-Mailbox, achieving up to 13.79% error reduction compared to uncalibrated GNN predictions.
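Group-specific temperature scaling itself is a small operation: a node's logits are divided by the temperature learned for its group before the softmax. A minimal sketch, not the paper's code; how nodes are assigned a `group_id` is assumed given:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def group_temperature_scale(logits, group_id, temps):
    """Calibrate a node's logits with its group's temperature:
    T < 1 sharpens the distribution, T > 1 flattens it."""
    t = temps[group_id]
    return softmax([z / t for z in logits])
```

A node in an overconfident group would be assigned T > 1, pulling its predicted probabilities toward uniform; the gain over node-wise scaling comes from learning one T per (similarity, confidence) group.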

23 Likes

Arxiv · 7d · 303 reads

Distributionally Robust Policy Learning under Concept Drifts

  • Distributionally robust policy learning aims to find a policy that performs well under the worst-case distributional shift.
  • Existing methods for robust policy learning consider the worst-case joint distribution of the covariate and the outcome, which can be unnecessarily conservative.
  • This paper focuses on robust policy learning under concept drift, where only the conditional relationship between the outcome and the covariate changes.
  • The paper proposes a learning algorithm that maximizes the estimated policy value within a given policy class, with an optimal sub-optimality gap.

18 Likes

Arxiv · 7d · 295 reads

The Multiplex Classification Framework: optimizing multi-label classifiers through problem transformation, ontology engineering, and model ensembling

  • A new approach called the Multiplex Classification Framework has been introduced to address the complexities of classification problems through problem transformation, ontology engineering, and model ensembling.
  • The framework offers adaptability to any number of classes and logical constraints, a method for managing class imbalance, elimination of confidence threshold selection, and a modular structure.
  • Experiments comparing the Multiplex approach with conventional classification models showed significant improvement in classification performance, especially in problems with a large number of classes and class imbalances.
  • However, the Multiplex approach requires understanding of the problem domain and experience with ontology engineering, and involves training multiple models.

17 Likes

Arxiv · 7d · 270 reads

Stealing That Free Lunch: Exposing the Limits of Dyna-Style Reinforcement Learning

  • Dyna-style off-policy model-based reinforcement learning (DMBRL) algorithms are facing a performance gap when applied across different benchmark environments.
  • While DMBRL algorithms perform well in OpenAI Gym, their performance drops significantly in DeepMind Control Suite (DMC) with proprioceptive observations.
  • Modern techniques designed to address issues in these settings do not consistently improve performance across all environments.
  • Adding synthetic rollouts to the training process, which is the backbone of Dyna-style algorithms, significantly degrades performance in most DMC environments.

16 Likes

Arxiv · 7d · 124 reads

Covariances for Free: Exploiting Mean Distributions for Federated Learning with Pre-Trained Models

  • This research proposes a training-free method for federated learning with pre-trained models.
  • The method utilizes an unbiased estimator of class covariance matrices.
  • It only requires the communication of class means, reducing communication costs.
  • The approach improves performance by 4-26% compared to existing methods with the same communication cost.

7 Likes

Arxiv · 7d · 307 reads

A Unifying Information-theoretic Perspective on Evaluating Generative Models

  • There is significant current research focused on determining meaningful evaluation metrics for generative models.
  • A unifying perspective is needed to allow for easier comparison and clearer explanation of metric benefits and drawbacks.
  • A class of kth-nearest-neighbors (kNN)-based metrics is unified under an information-theoretic lens.
  • A tri-dimensional metric composed of Precision Cross-Entropy (PCE), Recall Cross-Entropy (RCE), and Recall Entropy (RE) is proposed to measure fidelity and diversity.
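As one concrete member of the kNN-metric family being unified, here is the classic improved-precision recipe (fraction of generated samples falling inside the kNN ball of some real sample). This is not the paper's PCE/RCE/RE formulation, just an illustrative relative:

```python
def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def knn_radius(point, dataset, k):
    """Distance from `point` to its k-th nearest neighbor in `dataset`."""
    return sorted(dist(point, q) for q in dataset if q != point)[k - 1]

def knn_precision(real, fake, k=1):
    """Fidelity estimate: how many generated samples lie within the
    kNN ball of at least one real sample."""
    radii = [knn_radius(r, real, k) for r in real]
    hits = sum(
        any(dist(f, r) <= rad for r, rad in zip(real, radii))
        for f in fake
    )
    return hits / len(fake)
```

Swapping the roles of `real` and `fake` gives the matching recall-style diversity estimate, which is exactly the precision/recall duality the information-theoretic view formalizes.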

18 Likes

Arxiv · 7d · 157 reads

Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference

  • Realtime environments change as agents perform action inference and learning, requiring high interaction frequencies to minimize regret.
  • Recent advances in machine learning involve larger neural networks with longer inference times, raising concerns about their applicability in realtime systems.
  • Proposed algorithms for staggering asynchronous inference processes ensure consistent time intervals for actions, enabling use of models with high inference times.

9 Likes
