techminis

A naukri.com initiative

Big Data News
Amazon · 4w

Enhance data governance with enforced metadata rules in Amazon DataZone

  • Amazon DataZone introduces enforced metadata governance to define metadata requirements for data consumers when they request access to data assets.
  • With enforced metadata rules, domain unit owners can establish consistent governance practices across all data subscriptions.
  • This feature simplifies the approval process by guiding data consumers through completing mandatory fields and enabling data owners to make informed decisions.
  • Domain owners can enforce mandatory metadata fields when data consumers request access to data assets, supporting audit and reporting needs.
  • DataZone empowers customers to meet compliance standards, maintain audit readiness, and simplify access workflows for enhanced efficiency and control.
  • Bristol Myers Squibb leverages Amazon DataZone to catalog and enable secure data subscriptions across the organization within governed project environments.
  • Enforced metadata rules benefit multiple stakeholders such as domain owners, data consumers, and data producers.
  • Amazon DataZone enables the creation of custom workflows for assets by capturing critical metadata from data consumers.
  • The solution walkthrough has two parts: an overview with prerequisites for metadata enforcement, and the steps to enforce metadata on subscription requests for both managed and custom assets.
  • Domain owners can apply metadata forms across all asset types or limit them to specific asset types, and can scope a rule to a single project or to all projects within the domain; a hedged API sketch follows this list.
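
A minimal sketch of creating such a rule with boto3. This assumes the DataZone Rules API (`create_rule`) available in recent SDK versions; the parameter shape and all identifiers below are assumptions, not taken from the article, so check the current boto3 reference before use.

```python
import boto3

# Hedged sketch: parameter names are assumptions based on the DataZone
# Rules API; all IDs and form names are hypothetical placeholders.
datazone = boto3.client("datazone")

response = datazone.create_rule(
    domainIdentifier="dzd_example123",        # hypothetical domain ID
    name="require-purpose-metadata",
    action="CREATE_SUBSCRIPTION_REQUEST",     # enforce at subscription time
    target={"domainUnitTarget": {"domainUnitId": "du_example456"}},
    detail={
        "metadataFormEnforcementDetail": {
            "requiredMetadataForms": [
                {"typeIdentifier": "PurposeForm", "typeRevision": "1"}
            ]
        }
    },
)
print(response["identifier"])
```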

Siliconangle · 4w

Empowering AI evolution: How DDN redefines data intelligence for the digital era

  • DataDirect Networks Inc. is a data intelligence firm that transforms raw data into actionable insights.
  • Data intelligence is crucial in the era of artificial intelligence, enabling organizations to leverage data for innovation and competitive advantage.
  • DDN's data intelligence platform facilitates the ingestion, analysis, and delivery of data for AI applications across industries.
  • DDN's partnership with Nvidia and focus on high-performance computing strengthens its role in the AI ecosystem.

Fintechnews · 4w

Temenos to Deliver On-Premises Generative AI for Banks with NVIDIA

  • Banking software provider Temenos has partnered with NVIDIA to offer banks on-premises generative AI solutions.
  • This collaboration enables banks to leverage AI for real-time insights while maintaining data security and control.
  • Temenos' generative AI, running on NVIDIA's accelerated computing platform, allows for rapid processing of information and scaling of AI capabilities.
  • The partnership aims to empower banks to innovate and improve efficiency, productivity, and customer satisfaction in an AI-driven world.

Siliconangle · 1M

Zilliz debuts new release of its cloud-based vector database

  • Zilliz Inc. has debuted a new release of Zilliz Cloud, its managed cloud vector database.
  • Zilliz Cloud is a paid, cloud-hosted version of the open-source Milvus vector database that adds extra features.
  • The new release promises up to 10 times faster query processing and includes AutoIndex, a feature for automated index generation.
  • Zilliz Cloud supports various search types and use-case-specific optimizations, simplifying vector retrieval for AI projects (a minimal usage sketch follows this list).
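
A minimal usage sketch with the open-source pymilvus client against a Zilliz Cloud endpoint. The cluster URI, token, and toy data are placeholders; creating a collection without explicit index parameters is what triggers the automated indexing mentioned above.

```python
from pymilvus import MilvusClient

# Placeholders: substitute your cluster endpoint and API key.
client = MilvusClient(
    uri="https://YOUR-CLUSTER.zillizcloud.com",
    token="YOUR_API_KEY",
)

# With no explicit index parameters, Zilliz Cloud applies AutoIndex,
# the automated index-generation feature described above.
client.create_collection(collection_name="docs", dimension=4)

client.insert(
    collection_name="docs",
    data=[{"id": 1, "vector": [0.1, 0.2, 0.3, 0.4]}],
)

hits = client.search(collection_name="docs", data=[[0.1, 0.2, 0.3, 0.4]], limit=3)
print(hits)
```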

Amazon · 1M

Introducing Point in Time queries and SQL/PPL support in Amazon OpenSearch Serverless

  • Amazon OpenSearch Serverless now supports Point in Time (PIT) search, Piped Processing Language (PPL), and Structured Query Language (SQL).
  • PIT search lets you run different queries against a dataset that’s fixed in time: you query a stable snapshot of your data while maintaining a consistent sort order across requests.
  • PIT search provides superior capabilities and performance because it isn’t bound to a single query and supports consistent pagination.
  • Using PIT involves three steps: create a PIT, run search queries with the PIT ID (using the search_after parameter for subsequent pages of results), and close the PIT when finished (sketched after this list).
  • SQL and PPL offer new ways to query data and the flexibility to use the format that works best for you; in addition to DSL, you can extract insights from OpenSearch Serverless using familiar SQL query syntax.
  • SQL supports complex queries over semi-structured data, set operations, and sub-queries with limited joins, while PPL queries are processed by sending them to the SQL plugin.
  • OpenSearch Serverless is a search and analytics engine that enables you to store, search, and analyze large volumes of data while reducing the burden of manual infrastructure provisioning and scaling.
  • The vector engine for OpenSearch Serverless makes it easy to build modern machine learning (ML)-augmented search experiences and generative AI applications without managing the underlying vector database infrastructure.
  • Keep some limitations in mind when using PIT search and SQL/PPL support: for example, search slicing is not supported in OpenSearch Serverless, and the total number of open PITs is restricted to 300 per collection sharing the same AWS KMS key.
  • Amazon OpenSearch Serverless is a robust tool that simplifies data management and enables you to derive actionable insights from data.
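
A hedged sketch of the three PIT steps using opensearch-py. The collection endpoint, region, index, and sort field are placeholders; the raw point_in_time paths are used via perform_request in case your opensearch-py version lacks dedicated PIT helpers.

```python
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, "us-east-1", "aoss")  # 'aoss' = Serverless

client = OpenSearch(
    hosts=[{"host": "example-collection.us-east-1.aoss.amazonaws.com", "port": 443}],
    http_auth=auth,
    use_ssl=True,
    connection_class=RequestsHttpConnection,
)

# Step 1: create a PIT pinned to the current state of the index.
pit = client.transport.perform_request(
    "POST", "/my-index/_search/point_in_time", params={"keep_alive": "10m"}
)
pit_id = pit["pit_id"]

# Step 2: page through results with the PIT ID and search_after
# ("timestamp" is a placeholder sort field). PIT searches omit the index.
page = client.search(body={
    "size": 100,
    "pit": {"id": pit_id, "keep_alive": "10m"},
    "sort": [{"timestamp": "asc"}],
})
last_sort = page["hits"]["hits"][-1]["sort"]
next_page = client.search(body={
    "size": 100,
    "pit": {"id": pit_id, "keep_alive": "10m"},
    "sort": [{"timestamp": "asc"}],
    "search_after": last_sort,
})

# Step 3: close the PIT when done.
client.transport.perform_request(
    "DELETE", "/_search/point_in_time", body={"pit_id": [pit_id]}
)
```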

Amazon · 1M

Introducing Amazon MWAA micro environments for Apache Airflow

  • Amazon MWAA introduces mw1.micro, the latest addition to its environment classes offering a cost-effective solution for running Airflow environments in the cloud.
  • mw1.micro configuration provides a balanced set of resources suitable for small-scale data processing and orchestration tasks.
  • mw1.micro environments support up to three concurrent tasks, can accommodate up to 25 DAGs, and are particularly well-suited for small production workloads.
  • Customers adopting a federated deployment model can provide isolated environments for different teams or departments and optimize cost with the introduction of mw1.micro.
  • Amazon MWAA and Airflow key functionalities are maintained with the new environment class, including web server, custom plugins and requirements, startup scripts, and Airflow UI.
  • mw1.micro introduces constraints such as only one scheduler/worker container and one Fargate task for the web server.
  • Pricing dimensions for Amazon MWAA remain unchanged; metadata database storage is billed on a pay-as-you-go basis.
  • Amazon recommends monitoring key metrics such as metadata database memory usage and CPU utilization of the worker/scheduler hybrid container for optimal operation.
  • An Amazon MWAA micro environment can be set up in your account and preferred AWS Region using the AWS Management Console, API, or AWS CLI, or automated with AWS CloudFormation, the AWS Cloud Development Kit (AWS CDK), or Terraform scripts (a hedged boto3 sketch follows this list).
  • mw1.micro offers a streamlined and economical manner for Airflow power users to manage and orchestrate data pipelines in the cloud.
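
A hedged boto3 sketch of creating an mw1.micro environment. All names, ARNs, and network values are placeholders, and the Airflow version is an assumption; only EnvironmentClass differs from how larger classes are provisioned.

```python
import boto3

mwaa = boto3.client("mwaa")

mwaa.create_environment(
    Name="micro-demo",
    EnvironmentClass="mw1.micro",            # the new micro class
    AirflowVersion="2.10.1",                 # assumption: pick a supported version
    DagS3Path="dags",
    SourceBucketArn="arn:aws:s3:::my-airflow-bucket",          # placeholder
    ExecutionRoleArn="arn:aws:iam::123456789012:role/mwaa-execution-role",
    NetworkConfiguration={
        "SecurityGroupIds": ["sg-0123456789abcdef0"],          # placeholders
        "SubnetIds": ["subnet-aaaa1111", "subnet-bbbb2222"],
    },
)
```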

Siliconangle · 1M

Snowflake to broaden interoperability with Microsoft Dataverse and Dynamics

  • Snowflake expands partnership with Microsoft to support interoperability.
  • Bidirectional access enabled between Microsoft's Dataverse and Snowflake AI Data Cloud.
  • Connector simplifies interoperability between Microsoft's Power Platform, Dynamics 365, and Snowflake.
  • Integration eliminates the need for custom workflows, reducing time spent on resource and infrastructure management.

Amazon · 1M

Integrate custom applications with AWS Lake Formation – Part 1

  • AWS Lake Formation is used to centrally manage and secure data for analysis and machine learning.
  • It allows fine-grained access control and encompasses data security using the AWS Glue Data Catalog.
  • In this two-part series, the article explores how to integrate custom applications or data processing engines with Lake Formation using third-party service integration.
  • The article explains the steps to enforce Lake Formation policies within custom data applications by invoking APIs such as sts:AssumeRole, glue:GetUnfilteredTableMetadata, glue:GetUnfilteredPartitionsMetadata, lakeformation:GetTemporaryGlueTableCredentials, and lakeformation:GetTemporaryGluePartitionCredentials (two of these calls are sketched after this list).
  • It further explains how IAM roles let an external application access resources in a Lake Formation environment.
  • Lambda serves as the external trusted engine in this instance, and an architecture diagram shows how the aforementioned APIs fit together.
  • The article also covers the prerequisites for deploying and testing the solution and provides instructions on how to test the Lambda function.
  • Finally, the article presents additional considerations and concludes that this exploration serves as a foundation for developing your own custom applications or data processing engines that operate on a Lake Formation-protected data lake.
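
A hedged sketch of two calls from the flow above; the account, database, table, and ARN values are placeholders, and the snippet assumes the caller has already assumed an IAM role registered as a trusted engine with Lake Formation.

```python
import boto3

glue = boto3.client("glue")
lakeformation = boto3.client("lakeformation")

# Fetch table metadata filtered by the caller's Lake Formation permissions.
table = glue.get_unfiltered_table_metadata(
    CatalogId="123456789012",
    DatabaseName="sales_db",
    Name="orders",
    SupportedPermissionTypes=["COLUMN_PERMISSION"],
)
print(table["Table"]["Name"], table.get("AuthorizedColumns"))

# Vend temporary credentials scoped to the table's storage location.
creds = lakeformation.get_temporary_glue_table_credentials(
    TableArn="arn:aws:glue:us-east-1:123456789012:table/sales_db/orders",
    SupportedPermissionTypes=["COLUMN_PERMISSION"],
)
print(creds["AccessKeyId"])
```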

Amazon · 1M

Integrate custom applications with AWS Lake Formation – Part 2

  • This post explores how to deploy a fully functional web client application, built with JavaScript/React and hosted through AWS Amplify, that uses a Python AWS Lambda function as its backend.
  • The provisioned web application provides a user-friendly and intuitive way to view the Lake Formation policies that have been enforced.
  • The article provides a solution overview along with prerequisites for setting up the application.
  • The application backend is a GraphQL API whose resolvers are implemented as a Python Lambda function (a minimal resolver sketch follows this list).
  • The application offers several controls, such as selecting a database, table, row type, and number of records.
  • The application also has four outputs organized in tabs, including Unfiltered Table Metadata, Unfiltered Partitions Metadata, Authorized Data, and All Data.
  • You can use the AWS AppSync GraphQL API deployed in this post for other applications.
  • The authors provide detailed example queries, along with their input parameters and all schema-defined fields as output.
  • In conclusion, this post provides a comprehensive understanding of how to extend the capabilities of Lake Formation by building and integrating your custom data processing applications.
  • The post is authored by Stefano Sandonà, a Senior Big Data Specialist Solution Architect at AWS, and Francesco Marelli, a Principal Solutions Architect at AWS.
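
A minimal sketch of a direct Lambda resolver for an AWS AppSync GraphQL API, mirroring the backend pattern described above. The field and argument names are hypothetical, not the ones from the article's schema.

```python
def handler(event, context):
    # AppSync direct Lambda resolvers receive the queried field name and
    # arguments in the event payload.
    field = event["info"]["fieldName"]
    args = event.get("arguments", {})

    if field == "getAuthorizedData":
        # The real backend would call the Glue/Lake Formation APIs from
        # Part 1 here and return the policy-filtered rows.
        return {
            "database": args.get("database"),
            "table": args.get("table"),
            "rows": [],
        }

    raise ValueError(f"Unhandled field: {field}")
```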

Amazon · 1M

Manage access controls in generative AI-powered search applications using Amazon OpenSearch Service and Amazon Cognito

  • Generative AI is finding increasing use in document search.
  • Organisations deploying these tools must ensure that only authenticated and authorised personnel can access documentation.
  • A solution combining document metadata, Amazon OpenSearch Service, and Amazon Cognito lets organisations control document access through a customised, role-based access mechanism.
  • OpenSearch Service stores documents as numeric vectors; similarity search compares a query vector against the stored vectors to return the closest matching documents.
  • Although documents need to be available to employees across manufacturing, HR, finance, and sales, specific limits are sometimes required and access must be restricted.
  • Access can be controlled by factors such as access level and department, so documents containing confidential information or customer strategy details are viewable only by those with authorisation (a hedged query sketch follows this list).
  • Controlling document access through metadata and OpenSearch Service is simpler than traditional multi-tenant document security and provides quick access to business-specific content.
  • The solution is deployed with the AWS CDK, provisioning AWS Amplify along with Amazon Cognito and OpenSearch Service.
  • In the Rolls Royce example, the search mechanism handled queries directly over a 10-terabyte unstructured document corpus within minutes.
  • With the AWS CDK, most resources can be tidied up piecemeal when the solution is no longer needed.
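
A hedged sketch of a role-aware vector query. The index, field names, and metadata values are assumptions illustrating the filtering idea; in the real solution the filter values would come from the caller's Cognito groups.

```python
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])  # placeholder connection

question_embedding = [0.1] * 768  # placeholder: output of your embedding model
user_access_level = 2             # placeholder: derived from the user's Cognito groups

query = {
    "size": 5,
    "query": {
        "knn": {
            "embedding": {
                "vector": question_embedding,
                "k": 5,
                # Efficient filtering: only documents the caller is allowed
                # to see are considered during the nearest-neighbour search.
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"department": "finance"}},
                            {"range": {"access_level": {"lte": user_access_level}}},
                        ]
                    }
                },
            }
        }
    },
}

results = client.search(index="documents", body=query)
for hit in results["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))
```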

Siliconangle · 1M

Microsoft Fabric gets enhanced AI development and data integration features

  • Microsoft Fabric is getting new tools and capabilities to streamline artificial intelligence application development, enhance data integration, and improve security.
  • Fabric Databases, now in preview, provide simpler database provisioning with autonomous functions and allow data to be automatically replicated to Fabric's integrated OneLake data lake.
  • New AI capabilities within Fabric, such as AI Functions and Azure AI Agent Service Integration, make generative AI more accessible and enable simplified connection to enterprise data sources.
  • OneLake Catalog, now generally available, provides a unified platform for exploring, managing, and governing Microsoft Fabric data and will be enhanced with new features for data quality, labeling, and compliance.

TechBullion · 1M

Investment Research Tools: Gaining Insights with AI and Data Analytics

  • Investors have been using AI and data analytics tools to make smarter investment decisions with better risk management.
  • AI's ability to analyse large datasets and uncover patterns has led to the development of powerful tools for investment research.
  • Predictive analytics proves to be one of the most valuable applications of AI in investment research.
  • Investors can gauge public sentiment by using NLP to analyze news articles, social media posts, and earnings reports (a minimal sketch follows this list).
  • Big data is one of the most critical aspects of modern investment research as data analytics can help identify emerging market trends and opportunities.
  • Quantitative analysis using mathematical models and algorithms helps in analyzing financial data.
  • AI and data analytics have improved the accuracy of decision making, enhanced efficiency, and reduced risk in investment.
  • AI-powered tools help investors to react to changes in the market in real-time, ensuring they don’t miss out on profitable opportunities.
  • With predictive analytics and NLP, investors can avoid making emotional decisions driven by fear or greed.
  • AI and data analytics also help investment managers optimize their portfolios by analyzing factors like asset allocation, risk tolerance, and performance metrics.
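
A minimal sketch of the sentiment idea above, scoring headlines with an off-the-shelf NLP model. The headlines are made-up examples; real pipelines would also weight source reliability and recency.

```python
from transformers import pipeline

# Downloads a default sentiment model on first run.
sentiment = pipeline("sentiment-analysis")

headlines = [
    "Chipmaker beats earnings expectations on strong AI demand",
    "Regulator opens probe into lender's accounting practices",
]
for headline, score in zip(headlines, sentiment(headlines)):
    print(f"{score['label']:>8}  {score['score']:.2f}  {headline}")
```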

Siliconangle · 1M

Empowering developers: How T-Mobile leverages Kubernetes for AI and machine vision

  • T-Mobile leverages Kubernetes for AI and machine vision to drive efficiency and developer autonomy.
  • The majority of T-Mobile's applications run within Kubernetes, providing efficiency and scalability.
  • Kubernetes is transforming how enterprises manage workloads, allowing developers to focus on innovation.
  • Portworx by Pure Storage supports T-Mobile's cloud-native data needs, including AI and machine learning models.

Precisely · 1M

Expert Insights for Your 2025 Data, Analytics, and AI Initiatives

  • Data integrity is crucial for AI initiatives and better decision-making, but data trust is declining across organizations.
  • Data ecosystem complexity, pervasive cloud modernization, and viewing data as a product all contribute to a decline in data confidence.
  • Successful AI implementations rely on data governance, quality, and overall integrity.
  • To find success with AI initiatives, organizations need to upskill their employees and create a data-driven culture.
  • Improving data quality and governance is key, with metadata management and behavior-based data observability being crucial focus areas.
  • Businesses are increasingly recognizing the value of location-based insights as a source of competitive advantage.
  • Enhancing data with location intelligence can help businesses make more informed decisions in areas such as site selection, last-mile delivery optimization, and risk assessment.
  • To stay ahead of the curve, businesses must build a long-term, disciplined approach to their data strategy and invest in next-generation technologies.
  • Data integrity is a top priority for 76% of organizations surveyed, with data-driven decision-making as a top goal for their data programs.
  • Businesses also need to identify relevant data sets for AI training and inference to prevent introducing harmful biases into their models.

Amazon · 1M

Enrich your AWS Glue Data Catalog with generative AI metadata using Amazon Bedrock

  • Generating metadata for data assets is time-consuming but can be automated using generative AI. Amazon Bedrock offers a choice of high-performing foundation models (FMs) from providers such as AI21 Labs, Anthropic, and Cohere. In this solution, the AWS Glue Data Catalog is enriched with dynamic metadata using FMs on Amazon Bedrock together with data documentation.
  • This post shows two approaches to generating descriptive metadata for tables in the Data Catalog with two different generative AI models available in Amazon Bedrock: in-context learning and Retrieval Augmented Generation (RAG).
  • In the in-context learning approach, the model generates metadata descriptions without documentation, whereas the RAG approach uses external documentation to generate richer and more accurate metadata.
  • After ingesting data from a public Amazon Simple Storage Service (Amazon S3) bucket, the approach was first deployed on a small database.
  • The metadata generation process involves chunking the content of an HTML page of data documentation, generating and storing vector embeddings for that documentation, instructing the model on which information to generate, sending the prompt to the model, and updating the table metadata in the Data Catalog (the final two steps are sketched after this list).
  • The approaches demonstrated showcase the flexibility and versatility of this solution. Employing generative AI to improve and extend the metadata of existing data assets unlocks new levels of data intelligence, empowering your organization to make more informed decisions, drive data-driven innovation, and realize the full value of its data.
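
A hedged sketch of the last two steps above: asking a Bedrock model for a table description, then writing it back to the Data Catalog. The model ID, database, table, and prompt are placeholders; the real solution also feeds retrieved documentation chunks (RAG) into the prompt.

```python
import boto3

bedrock = boto3.client("bedrock-runtime")
glue = boto3.client("glue")

resp = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model choice
    messages=[{
        "role": "user",
        "content": [{"text": "Write a one-sentence description of a table "
                             "with columns: order_id, customer_id, amount."}],
    }],
)
description = resp["output"]["message"]["content"][0]["text"]

# Read-modify-write: update_table replaces the table definition, so start
# from the existing one and change only the description.
table = glue.get_table(DatabaseName="sales_db", Name="orders")["Table"]
table_input = {k: v for k, v in table.items()
               if k in ("Name", "StorageDescriptor", "PartitionKeys",
                        "TableType", "Parameters")}
table_input["Description"] = description
glue.update_table(DatabaseName="sales_db", TableInput=table_input)
```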
