Data Science News, Latest Updates and Recent Announcements on Techminis

A naukri.com initiative

Data Science News

VentureBeat

Astronomer’s $93M raise underscores a new reality: Orchestration is king in AI infrastructure

Astronomer has raised $93 million in Series D funding to enhance its data orchestration platform Astro, powered by Apache Airflow.
The funding round was led by Bain Capital Ventures and included participation from Salesforce Ventures and existing investors like Insight and Venrock.
Astronomer plans to use the funding to accelerate research and development and expand globally, with a focus on Europe, Australia, and New Zealand.
Data orchestration plays a crucial role in overcoming the 'AI implementation gap' in deploying AI initiatives effectively across enterprises.
Astronomer's platform, Astro, based on Apache Airflow, is recognized for its role in orchestrating complex data workflows across various systems.
The company's growth signifies a shift in enterprises' perception of data orchestration from backend infrastructure to a mission-critical technology enabling AI innovation.
Enterprises like Ford Motor Company use Astronomer's platform for AI operations, leveraging its orchestration capabilities for tasks like autonomous driving systems and machine learning operations.
Astronomer's emphasis on scalable orchestration and automation has driven significant customer growth and wider adoption of its platform for AI and machine learning applications.
The company's recent announcements include the general availability of Airflow 3.0, introducing features tailored for AI workloads and expanding industry partnerships.
Astronomer aims to redefine enterprise data operations by moving towards 'unified DataOps,' integrating observability, quality management, and governance into a single platform.
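
To make "orchestration" concrete: Apache Airflow models a workflow as a directed acyclic graph (DAG) of tasks, and a task runs only after the tasks it depends on have finished. The sketch below illustrates that ordering idea with Python's standard-library graphlib rather than Airflow itself; the task names and the pipeline are hypothetical and do not come from the article.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline (not from the article): each task maps to the set
# of tasks that must finish before it can start.
pipeline = {
    "extract_orders": set(),
    "extract_customers": set(),
    "join_tables": {"extract_orders", "extract_customers"},
    "train_model": {"join_tables"},
    "publish_report": {"train_model"},
}


def run_order(dag):
    """Return one valid execution order in which every task follows its dependencies."""
    return list(TopologicalSorter(dag).static_order())


order = run_order(pipeline)
print(order)
```

A real orchestrator adds scheduling, retries, and monitoring on top of this dependency ordering; the sketch covers only the ordering step.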

VentureBeat

Microsoft launches Phi-4-Reasoning-Plus, a small, powerful, open weights reasoning model!

Microsoft Research has released Phi-4-reasoning-plus, a 14-billion parameter language model designed for tasks requiring deep reasoning.
The model integrates supervised fine-tuning and reinforcement learning to enhance performance in math, science, and logic-based tasks.
Phi-4-reasoning-plus is licensed under MIT, compatible with various frameworks, and aims for high-quality reasoning.
Structured training with special tokens enhances transparency and coherence in problem solving.
The model balances correctness, conciseness, and formatting consistency through reinforcement learning.
Phi-4-reasoning-plus is suited for applications requiring high-quality reasoning under memory or latency constraints.
It supports context lengths up to 64,000 tokens and performs well in chat-like settings.
Despite its smaller size, the model outperforms larger models on benchmarks like the AIME 2025 math exam.
Phi-4-reasoning-plus is positioned as a research tool for generative AI systems and not as a universal solution.
Careful evaluation of performance, safety, and fairness is advised before deploying the model in high-stakes environments.

Medium

AI-Powered Video Resolution Technology

AI-Powered Video Resolution Technology can transform low-quality footage into stunning 4K and beyond.
Enhance your videos instantly with AI-driven resolution enhancers that upgrade blurry clips to crystal-clear 4K quality.
AI-powered technology uses advanced algorithms to upscale and enhance footage, achieving high-quality results that mimic professional cameras.
Aside from resolution enhancement, AI video enhancers also reduce noise, stabilize shaky clips, and provide overall improved video quality.

VentureBeat

Salesforce takes aim at ‘jagged intelligence’ in push for more reliable AI

Salesforce is addressing the challenge of 'jagged intelligence' in AI for business applications, aiming to bridge the gap between raw intelligence and reliable performance in unpredictable enterprise environments.
Salesforce AI Research introduced new benchmarks, models, and frameworks to enhance AI agents' intelligence, trustworthiness, and adaptability for enterprise use.
The focus is on 'Enterprise General Intelligence' (EGI) tailored for business complexity, contrasting with the concept of Artificial General Intelligence (AGI).
A key aspect of the research is measuring and tackling AI's performance inconsistency, highlighted by the introduction of the SIMPLE dataset for evaluating AI systems.
CRMArena, a benchmarking framework simulating real customer relationship management scenarios, aims to fill the gap between academic benchmarks and real-world business needs.
Salesforce announced technical innovations like SFR-Embedding for deeper contextual understanding, xLAM V2 models for action prediction, and SFR-Guard for AI safety and reliability.
ContextualJudgeBench and TACO were also launched, focusing on evaluating judge models in context and multimodal action models for complex problem-solving, respectively.
The emphasis on customer co-innovation and AI reliability aligns with Salesforce's goal of providing dependable AI solutions for enterprises, acknowledging the low tolerance for inaccuracies in enterprise data.
Salesforce's research efforts underscore a shift towards prioritizing consistency and reliability in AI systems for real-world business applications over just raw intelligence.
The company's new technologies are set to roll out gradually, with a strategic focus on enhancing consistency and reliability in AI solutions for businesses.

Medium

How Data Science Shapes Tomorrow’s Daily Life

Future daily life will be shaped by data science through personalized health engines, data-driven home automation systems, and customized educational modules.
AI-driven copilots will optimize professional tasks, while metropolitan transit systems will offer flexible routing based on real-time data.
Ownership models will shift towards distributed micro-fleets, IoT and predictive analytics will improve waste systems and power management, and data-driven platforms will connect local communities.
Ethical frameworks, explainability modules, data wallets, and privacy methods will become integral as data science becomes deeply embedded in everyday life.

Nycdatascience

Catching Frauds in the Healthcare System

The U.S. Department of Justice (DOJ) estimates healthcare fraud drains about $100 billion annually, approximately 10% of U.S. healthcare spending.
A data enthusiast worked on analyzing historical claim data to predict potentially fraudulent healthcare providers, aiming to reduce fraud costs by billions of dollars.
The analysis was based on Kaggle's Medicare fraud dataset using tools like Python, Pandas, NumPy, Matplotlib, Seaborn, and Scikit-learn.
Fraudulent providers, while only 9.35% of the list, accounted for over half of the total reimbursements in the dataset.
Key features for identifying fraud included total treatment time billed by a provider and the number of claims submitted.
Modeling efforts using Logistic Regression revealed the need for addressing imbalanced data, leading to the adoption of the SMOTE technique for better results.
SMOTE combined with Logistic Regression improved recall to 0.86 and the ROC AUC score to 0.9611, emphasizing the importance of catching fraudulent providers even at the cost of some false alarms.
Exploring more models, such as Random Forest and LightGBM, and real-time fraud detection scenarios is suggested as future work in healthcare fraud detection.
The project highlighted the significance of feature engineering, model tuning, and addressing imbalanced datasets in developing effective fraud detection models.
Looking beyond basic metrics like accuracy is crucial when dealing with real-world, imbalanced datasets to build more meaningful solutions in fraud detection.
The project showcased the importance of critical thinking in problem-solving and the continuous exploration of advanced techniques for fraud detection in the healthcare sector.
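
The point about accuracy on imbalanced data can be checked with simple arithmetic. The sketch below uses the 9.35% fraud share reported above; the provider count of 10,000 and the false-positive count of 900 are invented for illustration and are not taken from the project. It compares a model that never flags anyone with one that reaches roughly the reported 0.86 recall.

```python
def metrics(tp, fp, fn, tn):
    """Compute accuracy, recall, and precision from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision}


N_PROVIDERS = 10_000          # hypothetical list size, chosen for round numbers
FRAUD = 935                   # 9.35% of providers, the share reported in the summary
LEGIT = N_PROVIDERS - FRAUD

# Baseline: label every provider as legitimate.
always_legit = metrics(tp=0, fp=0, fn=FRAUD, tn=LEGIT)

# A model that catches 804 of the 935 fraudulent providers (recall of about 0.86)
# while raising 900 false alarms (the false-alarm count is an assumption).
flagging = metrics(tp=804, fp=900, fn=131, tn=LEGIT - 900)

print("always legit:", always_legit)
print("flagging:    ", flagging)
```

Under these assumed counts the do-nothing baseline scores higher on accuracy while catching no fraud at all, which is why recall and ROC AUC are the more informative metrics here.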

Analyticsindiamag

Satya Nadella Explains Why Microsoft Cancelled Data Centre Projects in US and Europe

Microsoft CEO Satya Nadella mentioned that the company is evaluating demand, workload patterns, and location before committing to data centre projects.
Nadella emphasized the importance of aligning builds and leases with future workload growth, considering factors like AI workloads and global demand.
Microsoft CFO Amy Hood noted that these decisions have long lead times, with land acquisition to build-out stretching from five to seven years.
Despite cancelling some projects, Microsoft continues to aggressively invest in expanding its data centre capacity and plans to increase it in Europe by 40% over the next two years.

Analyticsindiamag

Axtria’s Agentic AI Platform Delivers Speed, Power, and Freedom of Choice

Axtria Inc. has introduced Axtria InsightsMAx.ai, a powerful platform powered by agentic AI for life sciences organizations.
Axtria InsightsMAx.ai offers a flexible infrastructure with more than 30 life sciences-specific agents, applications, and APIs to address AI implementation challenges.
The platform allows companies freedom of choice in selecting domain-specific AI agents, automating processes, and making data-driven decisions without coding.
Axtria InsightsMAx.ai has already shown significant business outcomes and is operational in top global pharmaceutical companies, offering speed, accuracy, and cost reduction.

Medium

Mathematical Fundamentals for Machine Learning and Data Science

Linear algebra, calculus, probability and statistics, information theory, and optimization theory provide the mathematical foundations for machine learning algorithms and data science.
Key concepts in these mathematical fundamentals include vectors, matrices, derivatives, optimization, probability distributions, entropy, loss functions, and regularization.
Advanced topics such as ensemble methods, gradient boosting, and practical applications like time series analysis, recommendation systems, and natural language processing further enhance the understanding of mathematical integration in machine learning workflows.
Performance metrics and evaluation play a crucial role in assessing machine learning models, with classification metrics, regression metrics, and model comparison aiding in optimizing model performance.

Towards Data Science

How Would I Learn to Code with ChatGPT if I Had to Start Again

The article discusses the author's coding journey and various programming approaches learned over time.
Key strategies shared include choosing the right framework, breaking down projects, Googling each step, and systematic coding.
Strategies for troubleshooting bugs are detailed, involving checking documentation, searching on Google/Stack Overflow, intuition, and testing.
The shift toward new coders using ChatGPT as a study buddy is noted, along with its benefits and limitations in the coding learning process.
A scenario of debugging a Python script using ChatGPT is presented, highlighting how it can offer personalized guidance and concept clarifications.
The advantages of using ChatGPT in coding, such as efficient troubleshooting and concept clarification, are outlined.
The limitations of relying solely on ChatGPT, including lack of real-time interaction and potential extra debugging time, are discussed.
Recommendations are provided, including leveraging VS Code extensions for code auto-completion and balancing ChatGPT assistance with traditional learning methods.
The importance of diverse learning methods, collaboration for problem-solving, and practical coding practice is emphasized.
The author's reflections on the coding journey stress the importance of integrating traditional practices with innovative tools like ChatGPT and auto code-completion tools.
Overall, the article explores the role of ChatGPT in coding education and the need for a balanced approach to learning and problem-solving in programming.

Medium

Digital Drought: Will LLMs Run Out of Data Before They Reach AGI?

Leading AI researchers and organizations are concerned that AI progress may halt due to a lack of high-quality data for models like ChatGPT and Claude.
Stanford University's AI Index 2025 Report warns about the rapid depletion of internet training data, raising concerns about a digital drought that could impede AI advancement.
MIT's Data Provenance Initiative has observed a significant decrease in available content as publishers and platforms restrict data access for AI companies.
The rising scarcity of data raises the question of whether Large Language Models (LLMs) will exhaust their data sources before achieving the desired advanced capabilities, potentially stalling AI progress.

Medium

Gen AI Merchant Matching — using Opensource LLMs, Elastic search and FIASS

Mastercard achieved a 400% improvement in latency, accuracy, and cost-effectiveness using open-source LLMs, Elasticsearch, and FAISS.
The project automatically matches merchant names in financial records to standardized business names, which supports spend categorization, financial analysis, fraud detection, and clearer transaction histories for customers.
The project is structured into three stages: parsing messy transaction text, searching for the most likely business match using a hybrid of Elasticsearch and FAISS, and selecting the best match using large LLMs and an 'AI judge.'
Breaking down the problem into clear stages like parsing, hybrid search, and re-ranking has shown that building powerful AI systems is achievable even with modest resources, emphasizing thoughtful design and smart tool utilization over raw compute power.
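
The three stages can be sketched as a toy pipeline. This is only an illustration under stated assumptions, not the article's implementation: token overlap stands in for the keyword search, cosine similarity over character trigrams stands in for the vector search, and a string-similarity function stands in for the LLM judge. The merchant catalog, the noise-token list, and all function names are hypothetical.

```python
import re
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt

MERCHANTS = ["Starbucks", "Walmart", "Shell", "Amazon", "Target"]  # hypothetical catalog
NOISE = {"pos", "sq", "purchase", "debit", "card", "store"}        # hypothetical noise tokens


def parse(raw):
    """Stage 1: lowercase the text, drop digits, punctuation, and noise tokens."""
    tokens = re.findall(r"[a-z]+", raw.lower())
    return " ".join(t for t in tokens if t not in NOISE)


def trigrams(text):
    """Count character trigrams of the padded text (a crude stand-in for an embedding)."""
    padded = f"  {text} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def lexical_score(query, name):
    """Jaccard overlap of whole tokens (a crude stand-in for keyword search)."""
    q, n = set(query.split()), set(name.lower().split())
    return len(q & n) / len(q | n) if q | n else 0.0


def hybrid_search(query, k=3):
    """Stage 2: blend lexical and vector-style scores and keep the top k candidates."""
    scored = [
        (0.5 * lexical_score(query, name)
         + 0.5 * cosine(trigrams(query), trigrams(name.lower())), name)
        for name in MERCHANTS
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


def judge(query, candidates):
    """Stage 3 stand-in: the article uses LLMs here; this picks the closest string."""
    return max(candidates, key=lambda name: SequenceMatcher(None, query, name.lower()).ratio())


def match_merchant(raw):
    """Run all three stages on one raw transaction string."""
    query = parse(raw)
    return judge(query, hybrid_search(query))


print(match_merchant("POS 4521 STARBUCKS #0293 SEATTLE WA"))
print(match_merchant("WAL-MART #1234 debit"))
```

Keeping the stages separate means each stand-in could be swapped for a real search index or model without touching the other two, which is the design point the summary emphasizes.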

Analyticsindiamag

How Digi Yatra is Fixing Airport Travel Without the Privacy Trade-off

Digi Yatra, a digital initiative by the Ministry of Civil Aviation in India, aims to streamline the passenger experience at airports through facial recognition technology while respecting privacy.
CEO Suresh Khadakbhavi leads the Digi Yatra Foundation, which emphasizes Self-Sovereign Identity to ensure user data remains with individuals.
Initially utilizing Aadhaar for identity checks, Digi Yatra transitioned to a selfie-based system, receiving support from Nandan Nilekani.
Digi Yatra Foundation was established in 2019, with operations finally launching at Delhi, Bangalore, and Varanasi airports in December 2022.
The platform uses top-notch facial recognition algorithms for security, employs liveness detection to prevent fraud, and stores data on users' devices, purging it within 24 hours.
Despite privacy concerns and initial glitches, such as poor lighting conditions and crowding interfering with facial recognition, Digi Yatra continues to improve and expand.
Digi Yatra operates on a not-for-profit basis with plans for global expansion and integration with other travel entities like hotels, aiming to create a seamless travel experience.
While critics have raised privacy concerns, Digi Yatra emphasizes user voluntariness, minimal data collection, and adherence to global standards for security and efficiency.
The system has received positive feedback for its efficiency in reducing airport queues and enhancing passenger experience, with plans to scale to 130 airports.
By leveraging advanced technology, like cameras with adjustable angles for better face scanning, Digi Yatra aims to revolutionize airport travel while emphasizing user control over personal data.
Through its innovative approach and commitment to privacy and efficiency, Digi Yatra is paving the way for a smoother, safer, and more convenient airport travel experience in India and potentially worldwide.

Analyticsindiamag

EXL Posts $501 Million in Q1 2025 Revenue, 53% from AI Services

EXL reported a strong first-quarter revenue of $501.0 million for fiscal 2025, showing a 14.8% increase from Q1 2024.
Over 53% of EXL's Q1 revenue came from data and AI-led services, which grew by 16% year-over-year.
The company launched EXLerate.ai, an AI platform with industry-specific agents, and emphasized talent development with certifications for employees.
EXL raised its full-year 2025 revenue guidance and expressed cautious optimism amidst macroeconomic uncertainties.

Analyticsindiamag

Cognizant Q1 Net Profit Jumps 21%, Reports 1,400 Generative AI Engagements

Cognizant reported a 21% increase in net profit for Q1, reaching $663 million.
The revenue rose 7.5% YoY, reaching $5.1 billion, driven by growing demand for AI-driven IT services.
The company has approximately 1,400 early-stage GenAI engagements and is actively scaling its work in generative AI.
Cognizant also announced revenue projections for the second quarter and full-year 2025.
