techminis

A naukri.com initiative

ML News

Source: Arxiv · 2d

Combining Physics-based and Data-driven Modeling for Building Energy Systems

  • Building energy modeling combines physics-based and data-driven models for optimizing the operation of building energy systems.
  • Hybrid approaches are being explored, where physics-based model output is used as additional input for data-driven models.
  • A real-world case study on indoor thermodynamics evaluates four predominant hybrid approaches in building energy modeling.
  • The study finds that hybrid approaches perform better as building documentation and sensor availability increase, and that hierarchical Shapley values are effective for explaining and improving the models.
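The feature-augmentation flavor of these hybrids can be sketched on toy data (variable names, coefficients, and data ranges are invented for illustration; pure NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy indoor-temperature data (hypothetical): outdoor temp, heating power.
X = rng.uniform([0.0, 0.0], [30.0, 5.0], size=(200, 2))

# Unknown "true" thermodynamics, with an interaction term and noise.
y = 0.6 * X[:, 0] + 2.0 * X[:, 1] + 0.05 * X[:, 0] * X[:, 1] + rng.normal(0, 0.1, 200)

# Simplified physics-based model: a linear heat balance with assumed coefficients.
physics_pred = 0.5 * X[:, 0] + 1.8 * X[:, 1]

# Hybrid approach: the physics-based model's output becomes an extra input
# feature for the data-driven model (here, ordinary least squares).
X_hybrid = np.column_stack([np.ones(len(X)), X, physics_pred])
coef, *_ = np.linalg.lstsq(X_hybrid[:100], y[:100], rcond=None)

pred = X_hybrid[100:] @ coef
r2 = 1 - np.sum((y[100:] - pred) ** 2) / np.sum((y[100:] - y[100:].mean()) ** 2)
print(f"hybrid hold-out R^2: {r2:.3f}")
```

The same scaffold supports the other hybrid variants the study compares, e.g. fitting the data-driven model to the residual `y - physics_pred` instead.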

Source: Arxiv · 2d

MEG: Medical Knowledge-Augmented Large Language Models for Question Answering

  • MEG is a parameter-efficient approach for medical knowledge-augmented large language models (LLMs).
  • MEG utilizes a lightweight mapping network to incorporate knowledge graph embeddings into LLMs.
  • MEG improves the performance of LLMs in interpreting knowledge graph embeddings.
  • MEG outperforms specialized models like BioMistral-7B and MediTron-7B in medical knowledge-based question answering.
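The general mapping-network idea can be sketched as follows (the layer sizes and the two-layer MLP shape are assumptions for illustration, not MEG's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

KG_DIM, LLM_DIM, HIDDEN = 128, 768, 256   # hypothetical sizes

# Lightweight mapping network: a small MLP that projects a knowledge-graph
# entity embedding into the LLM's token-embedding space.
W1 = rng.normal(0, 0.02, (KG_DIM, HIDDEN))
W2 = rng.normal(0, 0.02, (HIDDEN, LLM_DIM))

def map_kg_to_llm(kg_emb: np.ndarray) -> np.ndarray:
    """Project KG embeddings (batch, KG_DIM) -> soft tokens (batch, LLM_DIM)."""
    h = np.maximum(kg_emb @ W1, 0.0)      # ReLU
    return h @ W2

# One entity embedding becomes one extra "soft token" prepended to the prompt.
kg_emb = rng.normal(size=(1, KG_DIM))
soft_token = map_kg_to_llm(kg_emb)
prompt_embs = rng.normal(size=(10, LLM_DIM))      # stand-in for token embeddings
augmented = np.vstack([soft_token, prompt_embs])  # (11, LLM_DIM)
print(augmented.shape)
```

Because only the mapping network is trained, the approach stays parameter-efficient: the LLM's own weights can be left frozen.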

Source: Arxiv · 2d

A Novel Adaptive Hybrid Focal-Entropy Loss for Enhancing Diabetic Retinopathy Detection Using Convolutional Neural Networks

  • Diabetic retinopathy is a leading cause of blindness, and precise AI-based diagnostic tools are needed.
  • The Adaptive Hybrid Focal-Entropy Loss addresses class imbalance and challenges in diabetic retinopathy detection.
  • The proposed method achieved improved performance with various models, indicating its potential in complex medical datasets.
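The paper's adaptive hybrid loss is not reproduced here, but the standard focal loss it builds on illustrates how such losses counter class imbalance (a minimal NumPy sketch):

```python
import numpy as np

def focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Standard focal loss (Lin et al.) for binary classification.

    probs:   predicted probability of the positive class, shape (n,)
    targets: 0/1 labels, shape (n,)
    The (1 - p_t)^gamma factor down-weights easy, well-classified examples,
    so training focuses on hard and minority-class cases.
    """
    p_t = np.where(targets == 1, probs, 1 - probs)
    alpha_t = np.where(targets == 1, alpha, 1 - alpha)
    return np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t + 1e-12))

# An easy, confident correct prediction contributes far less than a hard one.
easy = focal_loss(np.array([0.95]), np.array([1]))
hard = focal_loss(np.array([0.30]), np.array([1]))
print(easy, hard)
```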

Source: Arxiv · 2d

Program Evaluation with Remotely Sensed Outcomes

  • Economists often estimate treatment effects in experiments using remotely sensed variables (RSVs) like satellite images or mobile phone activity.
  • A common practice is to use an observational sample to train a predictor of the economic outcome from the RSV.
  • However, using this method may introduce bias if the RSV is post-outcome, meaning a variation in the economic outcome causes variation in the RSV.
  • The study proposes a new identification method that requires three predictions of RSVs for more accurate causal inference.

Source: Arxiv · 2d

SNN-Based Online Learning of Concepts and Action Laws in an Open World

  • A new study presents the architecture of an autonomous, bio-inspired cognitive agent.
  • The agent is built around a spiking neural network (SNN) that implements its semantic memory.
  • It learns concepts of objects/situations and its own actions in a one-shot manner.
  • The agent can adapt to new situations by relying on previously learned general concepts.

Source: Arxiv · 2d

Adapter-Enhanced Semantic Prompting for Continual Learning

  • Continual learning must cope with catastrophic forgetting, where new knowledge overwrites previously acquired knowledge.
  • The paper introduces Adapter-Enhanced Semantic Prompting (AESP) as a lightweight CL framework.
  • AESP utilizes semantic-guided prompts and adapters to enhance generalization and adapt features efficiently.
  • Experiments show that AESP achieves favorable performance in continual learning.

Source: Arxiv · 2d

Dataset-Agnostic Recommender Systems

  • Recommender systems have become a cornerstone of personalized user experiences.
  • A new paradigm called Dataset-Agnostic Recommender Systems (DAReS) has been introduced.
  • DAReS enables a single codebase to adapt to multiple datasets without manual fine-tuning.
  • The system utilizes a Dataset Description Language (DsDL) for automated dataset management.

Source: Arxiv · 2d

Rethinking Timing Residuals: Advancing PET Detectors with Explicit TOF Corrections

  • PET detectors have timing performance degraded by various factors.
  • Researchers developed a residual physics-based calibration approach using machine learning.
  • The approach simplifies data acquisition, improves linearity, and enhances timing performance.
  • Experiments were conducted using two detector stacks with specific components.

Source: Arxiv · 2d

Novel computational workflows for natural and biomedical image processing based on hypercomplex algebras

  • This work introduces novel computational workflows for natural and biomedical image processing based on hypercomplex algebras.
  • The workflows utilize quaternions and the two-dimensional orthogonal planes split framework to perform tasks such as image re-colorization, de-colorization, contrast enhancement, computational re-staining, and stain separation.
  • The proposed methodologies exhibit versatility and consistency across different image processing tasks and have the potential to benefit computer vision and biomedical applications.
  • The non-data-driven methods offered in this work achieve comparable or better results compared to existing techniques, highlighting the practical effectiveness of the proposed approaches.
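A minimal sketch of the underlying quaternion color representation (not the paper's full orthogonal-planes-split workflow): an RGB pixel encoded as a pure quaternion can be hue-rotated by a quaternion rotation about the gray axis, which is the kind of operation these re-colorization workflows compose.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def rotate_color(rgb, angle):
    """Rotate an RGB pixel, encoded as the pure quaternion r*i + g*j + b*k,
    about the gray axis (1,1,1)/sqrt(3): hue shifts, intensity is preserved."""
    axis = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
    u = np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])
    u_conj = u * np.array([1, -1, -1, -1])
    q = np.concatenate([[0.0], rgb])
    return qmul(qmul(u, q), u_conj)[1:]

# A 120-degree rotation about the gray axis cycles red -> green.
print(np.round(rotate_color(np.array([1.0, 0.0, 0.0]), 2 * np.pi / 3), 6))
```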

Source: Arxiv · 2d

AudioX: Diffusion Transformer for Anything-to-Audio Generation

  • AudioX is a unified Diffusion Transformer model for Anything-to-Audio and Music Generation.
  • It can generate both general audio and music with high quality, and offers flexible natural language control and seamless processing of various modalities including text, video, image, music, and audio.
  • AudioX utilizes a multi-modal masked training strategy to learn from masked inputs across modalities, resulting in robust and unified cross-modal representations.
  • Extensive experiments show that AudioX outperforms state-of-the-art specialized models and exhibits remarkable versatility in handling diverse input modalities and generation tasks within a unified architecture.

Source: Medium · 2d

From BoW to GPT: A Brief History of Language Models and Where They’re Going

  • Language models have evolved from simple word counters to billion-parameter models, built on decades of breakthroughs.
  • The progress from Bag-of-Words (BoW) to n-gram models improved context prediction but lacked understanding.
  • Word2Vec introduced word embeddings, representing words as vectors in a relational space.
  • Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks improved language sequence processing.
  • The introduction of Transformers in 2017 revolutionized language modeling with self-attention.
  • Models like BERT, GPT-2, and T5 are built on the Transformer architecture, focusing on context and relationships.
  • Recent advancements have led to massive models like GPT-3 with 175 billion parameters and instruction-tuned models for various tasks.
  • Future directions include domain-specific models, neuro-symbolic systems, and connecting language models with real-world data.
  • The evolution of language models reflects our evolving understanding of language, culture, and the nuances of communication.
  • The future of language models may prioritize smarter, more understanding models over simply scaling up in size.
  • Understanding the history of language models is essential in grasping how each step contributes to the evolving capabilities of machines in language processing.

Source: Towards Data Science · 2d

Exporting MLflow Experiments from Restricted HPC Systems

  • Running MLflow experiments on restricted HPC systems can be challenging, as outbound TCP connections are often limited.
  • To bypass direct communication limitations, a workaround involves setting up a local MLflow server with local directory storage.
  • Steps include creating a virtual environment, installing MLflow, and using mlflow-export-import for data transfer.
  • Exporting experiment data to a local temp folder on the HPC and transferring it to the remote MLflow server is crucial.
  • Installation of Charmed MLflow (MLflow server, MySQL, MinIO) using juju on MicroK8s localhost is recommended.
  • Prerequisites include Python 3.2 loaded on both HPC and MLflow server, with specific configuration settings.
  • Issues like thread utilization errors can occur, but setting thread limits and environment variables can help in resolving them.
  • The process involves exporting experiments, transferring runs to the MLflow server, importing data to MySQL and MinIO, and cleaning up ports.
  • By spinning up a local MLflow server, exporting, and importing experiments, users can track and manage experiments in restricted environments.
  • Security precautions, such as secure transfer methods and monitoring local storage, are crucial when implementing this workaround.
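The export/import steps above can be sketched as a shell session. Hostnames, experiment names, and paths are placeholders, and the exact `export-experiment`/`import-experiment` flags should be checked against the `mlflow-export-import` documentation for the installed version:

```shell
# On the HPC: isolated environment with MLflow tracking to a local directory
python -m venv mlflow-env && source mlflow-env/bin/activate
pip install mlflow mlflow-export-import

# Runs write to local storage, so no outbound TCP connection is needed
export MLFLOW_TRACKING_URI="file://$HOME/mlruns"

# Export an experiment to a temp folder, then copy it off the HPC
export-experiment --experiment my-experiment --output-dir /tmp/my-experiment-export
# scp -r /tmp/my-experiment-export user@mlflow-host:/tmp/

# On the MLflow server: import into the remote tracking server
export MLFLOW_TRACKING_URI="http://localhost:5000"
import-experiment --experiment-name my-experiment --input-dir /tmp/my-experiment-export
```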

Source: Towards Data Science · 2d

How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals

  • The DeepSeek-R1 model gained attention for its reasoning abilities and cost-efficiency compared to other models.
  • Assessing DeepSeek-R1's reasoning abilities programmatically offers deeper insights.
  • Distilled models from DeepSeek-R1, varying in size, aim to replicate the larger model's performance.
  • Distillation transfers reasoning abilities to smaller, more efficient models for complex tasks.
  • The selection of a distilled model size depends on hardware capabilities and performance needs.
  • Benchmarks like GPQA-Diamond are used to evaluate reasoning capabilities in LLMs.
  • Tools like Ollama and OpenAI's simple-evals assist in evaluating reasoning models.
  • Evaluation results of DeepSeek-R1's distilled model on GPQA-Diamond highlighted some challenges.
  • Setting up Ollama and simple-evals for benchmarking involves specific configurations.
  • Although distilled models may have limitations in complex tasks, they offer opportunities for efficient deployment.
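A minimal setup sketch, assuming Ollama is installed locally (the model tag and harness details are illustrative; Ollama serves an OpenAI-compatible API at `localhost:11434/v1`, which is what lets simple-evals-style harnesses talk to it):

```shell
# Pull a distilled DeepSeek-R1 model sized to fit local hardware
ollama pull deepseek-r1:8b

# Fetch OpenAI's evaluation harness
git clone https://github.com/openai/simple-evals.git

# Point OpenAI-client-based code at Ollama's OpenAI-compatible endpoint;
# the API key only needs to be a non-empty string for local use
export OPENAI_BASE_URL="http://localhost:11434/v1"
export OPENAI_API_KEY="ollama"
```

From there, the GPQA-Diamond evaluation is invoked through simple-evals with the model name set to the pulled Ollama tag.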

Source: Medium · 2d

Understanding N-Grams in NLP: Capturing Context Beyond Single Words

  • N-grams capture word order and context in text, enhancing language understanding.
  • They are valuable in improving NLP models' performance.
  • Python and scikit-learn make it easy to extract N-grams.
  • N-grams add important context to language structure in NLP models.
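N-gram extraction is a sliding window over tokens; a pure-Python sketch (the scikit-learn equivalent is noted in a comment):

```python
from collections import Counter

def ngrams(tokens, n):
    """Slide a window of size n over the token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the cat sat on the cat".split()

bigrams = ngrams(tokens, 2)
print(bigrams)
# With scikit-learn, the same idea is CountVectorizer(ngram_range=(2, 2)).

# Counting N-grams is what turns them into model features.
counts = Counter(bigrams)
print(counts.most_common(1))
```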

Source: Medium · 2d

Writeup: Learning Artificial Intelligence as a Software Engineer

  • As of 2025, AI is evolving rapidly, with LLMs and multi-modal models continuously improving.
  • AI developments such as OpenAI's image generation and AI coding surpassing junior developers indicate the growing mainstream presence of AI.
  • Many AI startups are receiving funding, emphasizing the high demand for researchers, engineers, and startups in the AI field.
  • The writer emphasizes the accessibility of learning AI, citing personal experience and encouragement for others to engage in AI education.
  • The writer's chosen method for learning AI is through fast.ai's course, utilizing both video lectures and books for a comprehensive understanding.
  • Initial misconceptions the writer corrected include assuming AI is only LLMs or neural networks, and overestimating the deep-learning expertise and GPU resources needed to start training models.
  • The fast.ai course introduces practical training such as image classification and model building in Jupyter notebooks, emphasizing hands-on learning with PyTorch.
  • The writer suggests creating a 'playground' for experimentation to understand errors and improve learning processes.
  • Challenges like outdated APIs are addressed by using alternatives such as DuckDuckGo for data collection, and platforms like Google Colab and TensorDock for training.
  • The writer plans to continue learning AI, expanding vocabulary, moving on to state-of-the-art papers, and working on practical projects.
  • The experience of learning AI and sharing the journey underscores the importance of continuous learning and knowledge sharing in productive ways.
