Knowledge Graph Archives – DBpedia Association
https://www.dbpedia.org/blog/tag/knowledge-graph/

GSoC2022 – Call for Contributors
https://www.dbpedia.org/blog/gsoc2022/ (Fri, 11 Mar 2022)

Pinky: Gee, Brain, what are we gonna do this year?
Brain: Wear a mask, keep our distance, and do the same thing we do every year, Pinky. Taking over GSoC2022.

For the 11th year in a row, we have been accepted to be part of this incredible program to support young ambitious developers who want to work with open-source organizations like DBpedia.

So far, each year has brought us new project ideas, many amazing students and great project results that shaped the future of DBpedia. Even though Covid-19 changed a lot in the world, it couldn’t shake Google Summer of Code (GSoC) much. The program, designed to mentor youngsters from afar, is almost too perfect for us. One of the advantages of GSoC, especially in times like these, is the chance to work on projects remotely while still getting a first deep dive into open-source projects like ours.

DBpedia is now looking for contributors who want to work with us during the upcoming summer months.  

What is Google Summer of Code?

Google Summer of Code is a global program focused on bringing new developers into open-source software development. Stipends are paid to beginner contributors to open source aged 18 or over, who work for two and a half months (or longer) on a specific task. For GSoC newbies, this short video and the information provided on the GSoC website explain all there is to know about GSoC2022.

And this is how it works …

Step 1: Check out one of our projects here or draft your own.
Step 2: Get in touch with our mentors as soon as possible and write up a project proposal of at least 8 pages. Information about our proposal structure and a template are available here.
Step 3: After a selection phase, contributors are matched with a specific project and mentor(s) and start working on the project.

Application Procedure

Further information on the application procedure is available in our DBpedia Guidelines. There you will find information on how to contact us and how to apply properly for GSoC2022. Please also note the official GSoC 2022 timeline for your proposal submission and make sure to submit on time. Unfortunately, extensions cannot be granted. The final submission deadline is April 19, 2022, at 18:00 UTC.

Contact

Detailed information on how to apply is available on the DBpedia website. We’ve prepared an information kit for you. Please find all necessary information regarding the application procedure here.

And in case you still have questions, please do not hesitate to contact us via dbpedia@infai.org.

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Finally, we are looking forward to your contribution!

Yours DBpedia Association

How Innovative Organizations Use The World’s Largest Knowledge Graph
https://www.dbpedia.org/blog/how-innovative-organizations-use-the-worlds-largest-knowledge-graph/ (Tue, 09 Nov 2021)

DBpedia Member Features – Over the last year we gave DBpedia members multiple chances to present their work, tools and applications. In this way, our members gave exclusive insights on the DBpedia blog. This time we will continue with Diffbot, a California-based company whose mission is to “extract knowledge in an automated way from documents.” They will introduce the Diffbot Knowledge Graph and present topics like Market Intelligence and Ecommerce. Have fun reading!

by Filipe Mesquita & Merrill Cook, Diffbot

Diffbot is on a mission to create a knowledge graph of the entire public web. We are teaching a robot, affectionately known as Diffy, to read the web like a human and translate its contents into a format that (other perhaps less sophisticated) machines can understand. All of this information is linked and cleaned on a continuous basis to populate the Diffbot Knowledge Graph.

The Diffbot Knowledge Graph already contains billions of entities, including over 240M organizations, 700M people, 140M products, and 1.6B news articles. This scale is only possible because Diffy is fully autonomous and doesn’t depend on humans to build the Diffbot Knowledge Graph. Using cutting-edge crawling technology, natural language processing, and computer vision, Diffy is able to read and extract facts from across the entire web.

While we believe a knowledge graph like Diffbot’s will one day be used by virtually every organization, there are four use cases where the Diffbot Knowledge Graph excels today: (1) Market Intelligence, (2) News Monitoring, (3) E-commerce, and (4) Machine Learning.

Market Intelligence

Video: https://www.diffbot.com/assets/video/solutions-for-media-monitoring.mp4

At its simplest, market intelligence is the generation of insights about participants in a market. These can include customers, suppliers, competitors, as well as attitudes of the general public and political establishment.

While market intelligence data is all over the public web, this can be a “double-edged sword.” The range of potential sources for market intelligence data can exhaust the resources of even large teams performing manual fact accumulation.

Diffbot’s automated web data extraction eliminates the inefficiencies of manual fact gathering. Without such automation, it’s simply not possible to monitor everything about a company across the web.

We see market intelligence as one of the most well-developed use cases for the Diffbot Knowledge Graph. Here’s why:

  • The Diffbot Knowledge Graph is built around organizations, people, news articles, products, and the relationships among them. These are the types of facts that matter in market intelligence.
  • Knowledge graphs have flexible schemas, allowing new fact types to be added “on the fly” as the things we care about in the world change.
  • Knowledge graphs provide unique identifiers for all entities, supporting the disambiguation of entities like Apple (the company) vs. apple (the fruit).

Market intelligence uses from our customers include:

  • Querying the Knowledge Graph for companies that fit certain criteria (size, revenue, industry, location) rather than manually searching for them in Google (a sketch of such a query follows this list).
  • Creating dashboards to receive insights about companies in a certain industry.
  • Improving an internal database by using data from the Diffbot Knowledge Graph.
  • Custom solutions that incorporate multiple Diffbot products (custom web crawling, natural language processing, and Knowledge Graph data consumption).
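
The sketch below shows what such a criteria-based query can look like. Diffbot’s Knowledge Graph is queried through its own query language (DQL), so as a neutral stand-in this runs an analogous query against the public DBpedia SPARQL endpoint; the industry value and employee threshold are illustrative criteria only.

    # Sketch of a criteria-based company query, run against the public DBpedia
    # SPARQL endpoint as a stand-in for Diffbot's own DQL interface.
    import requests

    QUERY = """
    SELECT ?company ?employees WHERE {
      ?company a dbo:Company ;
               dbo:industry dbr:Automotive_industry ;
               dbo:numberOfEmployees ?employees .
      FILTER (?employees > 10000)
    }
    ORDER BY DESC(?employees)
    LIMIT 10
    """

    resp = requests.get(
        "https://dbpedia.org/sparql",
        params={"query": QUERY, "format": "application/sparql-results+json"},
        timeout=30,
    )
    for row in resp.json()["results"]["bindings"]:
        print(row["company"]["value"], row["employees"]["value"])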

News Monitoring

Sure, the news is all around us. But most companies are overwhelmed by the sheer amount of information produced every day that can impact their business.

The challenges faced by those trying to perform news monitoring on unstructured article data are numerous. Articles are structured differently across the web, making aggregation of diverse sources difficult. Many sources and aggregators silo their news by geographic location or language.

Strengths of providing article data through a wider Knowledge Graph include the ability to link articles to the entities (people, organizations, locations, etc.) mentioned in each article. Additional natural language processing makes it possible to identify quotes and who said them, as well as the article author’s sentiment towards each entity mentioned in the article.
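
As a small taste of what this looks like in practice, here is a minimal sketch against Diffbot’s public Article API (v3). The endpoint and parameters follow Diffbot’s documentation, but treat the exact field names (sentiment, tags) as assumptions to verify against the current docs, and the token and URL are placeholders.

    # Minimal sketch: fetch one article as structured data from Diffbot's
    # Article API (v3). Token and article URL are placeholders.
    import requests

    resp = requests.get(
        "https://api.diffbot.com/v3/article",
        params={
            "token": "YOUR_DIFFBOT_TOKEN",                 # placeholder
            "url": "https://example.com/some-news-story",  # article to extract
        },
        timeout=30,
    )
    article = resp.json()["objects"][0]
    print(article.get("title"))
    print(article.get("sentiment"))             # author sentiment, if returned
    for tag in article.get("tags", []):         # entities linked by Diffbot NLP
        print(tag.get("label"), tag.get("uri"))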

In high-velocity, socially fueled media, the need for automated analysis of textual information is even more pressing. Among the many applications of our technology, Diffbot is helping anti-bias and misinformation initiatives through partnerships involving FactMata as well as the European Journalism Centre.

Check out how easy it is to build your own custom pan-lingual news feed in our news feed builder.

Ecommerce

Many of the largest names in ecommerce have utilized Diffbot’s ability to transform unstructured product, review, and discussion data into valuable ecommerce intelligence, whether by pointing AI-enabled crawlers at their own marketplaces to detect fraudulent, duplicate, or underperforming products, or by analyzing competitor and supplier product listings.

One reason to use Diffbot’s AI-enabled Product API, or the product entities within the Knowledge Graph, is the difficulty of scraping product data at scale yourself. Many ecommerce sites employ active measures to make scraping their pages at scale difficult. We’ve already built out the infrastructure and can begin returning product data at scale in minutes.

The rule-based scraping used by many competitors and in-house teams means that whenever an ecommerce site shifts its layout, or you try to extract ecommerce web data from a new location, your extraction method is likely to break. Additionally, hidden or toggleable fields on many ecommerce pages are more easily extracted by solutions with strong machine-vision capabilities.

Diffbot’s decade-long focus on natural language processing also allows the inclusion of rich discussion data parsed for entities, connections, and sentiment. On large ecommerce sites, structuring and further processing review data can be a major undertaking and provide high value.

Machine Learning

Even when you can get your hands on the right raw data to train machine learning models, cleaning and labeling the data can be a costly process. To help with this, Diffbot’s Knowledge Graph provides potentially the largest selection of once unstructured web data, complete with data provenance and confidence scores for each fact.

Our customers use a wide range of web data to quickly and accurately train models on diverse data types. Need highly informal text input from reviews? Video data in a particular language? Product or firmographic data? It’s all in the Knowledge Graph, structured and with API access so customers can quickly jump into validating new models.

With a long association with Stanford University and many research partnerships, Diffbot’s experts in web-scale machine learning work in tandem with many customers to create custom solutions and mutually beneficial partnerships.

To some, 2020 was the year of the knowledge graph. And while innovative organizations have long seen the benefits of graph databases, recent developments in the speed of fact accumulation online mean the future of graphs has never been brighter.

A big thank you to Diffbot, especially to Filipe Mesquita and Merrill Cook, for presenting the Diffbot Knowledge Graph.

Yours,

DBpedia Association

Recap: Google Summer of Code 2021
https://www.dbpedia.org/blog/recap-google-summer-of-code-2021/ (Wed, 08 Sep 2021)

We received 26 project proposals for this Google Summer of Code (GSoC) edition.

[Image: video call with all GSoC students and mentors]

For the 10th year in a row, we were part of this incredible journey of young ambitious developers who joined us as an open-source organization to work on a Google Summer of Code project.

Each year has brought us new project ideas, many amazing students and mostly great project results that shaped the future of DBpedia. 

One of the advantages of Google Summer of Code, especially in times like these, is the chance to work on projects remotely while still getting a first deep dive into an open-source project like DBpedia.

Meet our Google Summer of Code students and their projects

Throughout the summer program, our ten finalists worked intensely on their challenging DBpedia projects, with great outcomes to show to the public. Projects ranged from extending a neural extraction framework to creating a DBpedia chatbot and a dashboard for DBpedia Spotlight. If you want deeper insights into our GSoC students’ work, you can find their blogs and repos in the following list. Check them out!

Thanks to mentors

Thanks to all our mentors around the world for joining us in this endeavour and for mentoring with kindness and technical expertise. A huge shout-out to those who have been by our side for so many years in a row. Many thanks to Tommaso Soru, Beyza Yaman, Diego Moussalem, Ricardo Usbeck, Edgard Marx, Mariano Rico, Thiago Castro Ferreira, Luca Virgili, Ram G Athreya, as well as Sebastian Hellmann, Nausheen Fatma, Said P. Martagon, Krishanu Konar, Zheyuan Bai, Julio Hernandez, Anand Panchbhai, and Jan Forberg. We would also like to thank Andreas Both, Aleksandr Perevalov, Lahiru Hinguruduwa, Marvin Hofer, Maribel Angelica Marin Castro, and Alex Winter, who were mentors for the first time this year. Thank you all again for spending 3.5+ months working with this year’s GSoC students and helping them become better open-source contributors!

Mentor Summit

In previous years you might have noticed that we always organized a little lottery to decide which mentor or organization admin could join the annual GSoC Mentor Summit. As this year’s event will be held online, space is open to all organization admins and mentors alike. The GSoC Virtual Mentor Summit takes place on November 4, 2021, and this year we hope all our mentors will find the time to join and exchange with fellow mentors from dozens of open-source projects.

After GSoC is before the next GSoC

We cannot wait for the 2022 edition. If you are an ambitious student who is interested in open-source development and working with DBpedia, you are more than welcome to either contribute your own project idea or apply for one of the project ideas we will offer starting in early 2022. If you would like to know where previous mentors and students are now working, please read our last GSoC blog post.

In case you would like to mentor a project, do not hesitate to get in touch with us via dbpedia@infai.org.

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Julia

on behalf of the DBpedia Association

Wrap Up: DBpedia Tech Tutorial @ Knowledge Graph Conference 2021
https://www.dbpedia.org/blog/dbpedia-tutorial-kgc-2021/ (Tue, 11 May 2021)

On Tuesday the 4th of May, DBpedia organized a tutorial at the Knowledge Graph Conference (KGC) 2021. The ultimate goal of the tutorial was to teach the participants all relevant tech around DBpedia, the knowledge graph, the infrastructure and possible use cases. The tutorial aimed at existing and potential new users of DBpedia, developers that wish to learn how to replicate DBpedia infrastructure, service providers, data providers as well as data scientists.

In the following, we give you a brief retrospective of the presentations. For further details, follow the links to the slides.

Opening

The tutorial, which was held online, was opened by Milan Dojchinovski (InfAI / DBpedia Association / CTU in Prague) with some general information about the program of the tutorial, its scope and technical details.

DBpedia in a Nutshell session

After the short opening, Milan continued with the first topic, background on the DBpedia Association: how it all started and the evolution of DBpedia. Linked Data and the LOD cloud were also addressed, as well as the mappings, extractors and data groups (e.g. mappings, generic, text, wikidata). Then the DBpedia Ontology was presented and explained. Milan concluded the first topic with information on the DBpedia SPARQL endpoint and the DBpedia Databus platform.

Getting Started with DBpedia session

The next point on the program was split into two subtopics. First, Jan Forberg (InfAI / DBpedia Association) explained where to find data, including the DBpedia SPARQL endpoint, the DBpedia Databus platform as a repository for DBpedia and related datasets, and the novel “collections” concept. Moreover, DBpedia services such as DBpedia Lookup and DBpedia Spotlight were presented.

Afterwards, Jan explained how to use the data hosted on the Databus. Starting by selecting particular artifacts, he walked through a Docker container into which the data can be downloaded and a simple bash script that submits a SPARQL query to retrieve specific data artifacts.
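
As a rough Python counterpart of that retrieval step (not the script shown in the tutorial): the query below asks the Databus SPARQL endpoint for the download URLs of one artifact’s files. The endpoint URL, the DataID vocabulary and the artifact IRI follow the Databus documentation of the time; treat all three as assumptions to verify.

    # Rough sketch of retrieving download URLs for one Databus artifact.
    # Endpoint, vocabulary and artifact IRI are assumptions to verify.
    import requests

    QUERY = """
    PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>
    PREFIX dcat:   <http://www.w3.org/ns/dcat#>
    SELECT ?file WHERE {
      ?dataset dataid:artifact
          <https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals> .
      ?dataset dcat:distribution ?distribution .
      ?distribution dcat:downloadURL ?file .
    }
    """

    resp = requests.get(
        "https://databus.dbpedia.org/repo/sparql",  # assumed endpoint
        params={"query": QUERY, "format": "application/sparql-results+json"},
        timeout=60,
    )
    for binding in resp.json()["results"]["bindings"]:
        print(binding["file"]["value"])  # each URL can then be downloaded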

Building National Knowledge Graphs using DBpedia Tech

In the following session, Johannes Frey (InfAI / DBpedia Association) explained how to build national knowledge graphs using DBpedia tech, using the Dutch National Knowledge Graph as an example. The Dutch National Knowledge Graph was presented during the DBpedia Hackathon 2020. For further information, have a look at the Hackathon 2020 presentations at https://tinyurl.com/kgia-2020-dnkg, and see all relevant data at https://databus.dbpedia.org/dnkg/fusion/dutch-national-kg/

DBpedia Technology Stack

Talking about the DBpedia technology stack, Jan started with the DBpedia Databus platform. He explained how the Databus platform works, its benefits (DataIDs and simple retrieval), the Databus SPARQL endpoints, and the Web API and Maven plugin. After that, Jan presented the Dockerized services, including DBpedia Virtuoso and the DBpedia plugin, DBpedia Spotlight (incl. use cases) and DBpedia Lookup.

Marvin Hofer (InfAI / DBpedia Association) then explained the DBpedia release process on the Databus and presented his work on debugging DBpedia and the DBpedia Mods technology. Marvin also explained the quality assurance process using the concept of minidumps.

Afterwards, Johannes explained (Pre)fusion, ID management and the novel concept of cartridges.

Subsequently, Denis Streitmatter (InfAI / DBpedia Association) presented the DBpedia Archivo ontology manager and how to include your own ontology in it. He also explained various use cases, e.g. how to find an ontology, how to test your ontology and how to back it up. He then presented the five-star schema for ontology tests and the SHACL-based tests for ontologies. Please read the official DBpedia Archivo call here: https://www.dbpedia.org/blog/dbpedia-archivo-call-to-improve-the-web-of-ontologies/
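
For a flavour of the “back it up” use case, here is a minimal sketch of fetching the latest snapshot of an ontology via Archivo’s download endpoint. The “o” (ontology IRI) and “f” (format) parameters follow the Archivo announcement, and the IRI is just an example; verify both against https://archivo.dbpedia.org before relying on this.

    # Minimal sketch: back up the latest snapshot of one ontology from Archivo.
    # Query parameters and ontology IRI are examples to verify.
    import requests

    resp = requests.get(
        "https://archivo.dbpedia.org/download",
        params={"o": "http://datashapes.org/dash", "f": "ttl"},
        timeout=60,
    )
    resp.raise_for_status()
    with open("dash-backup.ttl", "wb") as fh:
        fh.write(resp.content)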

Contributions to DBpedia

As the tutorial drew to a close, Milan explained how to improve existing mappings or introduce new ones. He talked about improving the DBpedia Information Extraction Framework as well as contributing DBpedia tests. He then explained contributing mappings and links for knowledge cartridges and how to write Mods for the Databus.

In case you missed the event, the presentation is also available on the DBpedia event page. Further insights, feedback and photos about the event are available on Twitter (#DBpediaTutorial hashtag).

We are now looking forward to the next DBpedia tutorial, which will be held on September 1, 2021, co-located with the LDK conference in Zaragoza, Spain. Check the details here and register now! Furthermore, we will organize DBpedia Day on September 9, 2021 at the Semantics Conference in Amsterdam. We are looking forward to meeting all Dutch DBpedians there!

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Yours DBpedia Association

ContextMinds: Concept mapping supported by DBpedia
https://www.dbpedia.org/blog/contextminds/ (Fri, 16 Apr 2021)

Contribution from Marek Dudáš (Prague University of Economics and Business – VŠE)

ContextMinds is a tool that combines two ideas: concept mapping and knowledge graphs. What’s concept mapping? With a bit of simplification: when you take a small subgraph of no more than a few tens of nodes from a knowledge graph (KG) and visualize it with the classic node-link (or “bubbles and arrows”) approach, you get a concept map. But concept maps are much older than knowledge graphs. They emerged in the 1970s and were originally intended to be created by hand, to represent a person’s understanding of a given problem or question. Shortly after their “discovery” (using diagrams to represent relationships is probably a much older idea), they turned out to be a very useful educational tool.

Going back to knowledge graphs and DBpedia, ContextMinds lets you quickly create an overview of some problem you need to solve, study or explain.  

Figure 1: Text search in concepts from DBpedia, the starting point of concept map creation in ContextMinds.

How you can start 

Starting from a classic text search, you select concepts (nodes) from a knowledge graph, and ContextMinds shows how they are related (it loads the links from the knowledge graph). It also suggests other concepts in the KG that you might be interested in. The suggestions are drawn from the joint neighborhood of the nodes you have already selected and put into the view. Nodes are scored by relevance, essentially by the number of links to what you have in the view. So, as you create your concept map, a continuously updated list of around 30 most related concepts is available for simple drag & drop onto your map.
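
As a toy illustration of that relevance idea (ours, not ContextMinds’ actual code): score each candidate from the joint neighborhood by how many links it has into the current selection, then keep the top candidates.

    # Toy illustration: rank candidate concepts from the joint neighborhood of
    # the current selection by the number of links to already-selected nodes.
    from collections import Counter

    def suggest(selected, edges, top_n=30):
        """edges: iterable of (node_a, node_b) pairs from the knowledge graph."""
        sel = set(selected)
        scores = Counter()
        for a, b in edges:
            if a in sel and b not in sel:
                scores[b] += 1
            elif b in sel and a not in sel:
                scores[a] += 1
        return [node for node, _ in scores.most_common(top_n)]

    edges = [("Plant", "Chlorophyll"), ("Photosynthesis", "Chlorophyll"),
             ("Photosynthesis", "Oxygen"), ("Plant", "Botany")]
    print(suggest({"Plant", "Photosynthesis"}, edges))
    # -> ['Chlorophyll', 'Oxygen', 'Botany']  (Chlorophyll links to both)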

Figure 2: Concept map and a list of top related concepts found in DBpedia by ContextMinds.

This helps you complete the concept map quickly. It also helps you discover relationships between concepts that you were not aware of. If a concept or relationship is not yet in the knowledge graph, you can create it. It will not only appear in your concept map, but will also become part of an extended knowledge graph for anyone who has access to your map. You can select the sources of concept & relationship suggestions at any time, choosing any combination of the personal scope (concepts from maps created by you), the workspace scope (a space shared with teammates), DBpedia (or a different KG) and the public scope (everything created by the community and made public).

The best way to explain how it works is a short video.

Use Case: Knowledge Graph 

ContextMinds was of course built with DBpedia as the initial knowledge graph. That instance is available at app.contextminds.com, and more than 100 schools are using it as an educational aid. Recently, we discovered that the same model can be useful with other knowledge graphs.

Say you run some machine learning that identifies objects in the knowledge graph as having interesting properties. Now you might need to look at what the graph contains about them, either to explain the results or to show them to domain experts so that they can use them for further research. That is where ContextMinds comes in. You put the concepts from the machine learning results into the view, and ContextMinds automatically adds the links between them and finds related concepts from their neighborhood. We have done this with kg-covid, a knowledge graph built from various biomedical and Covid-related datasets. There we use RDFrules to mine interesting rules and then visualize the results in ContextMinds (available at contextminds.vse.cz), so that biology experts may interpret them and explore further related information. More about that maybe later in another blog post.

Our Vision 

An additional fun fact: since we started developing ContextMinds to work solely with DBpedia, its data model is kind of hard-coded into it. The plan is to enable loading multiple knowledge graphs into a single ContextMinds instance so that users may interconnect objects from DBpedia with those from other datasets when creating a concept map; at the moment, however, we have to transform other data so that it looks like DBpedia before it can be loaded into ContextMinds.

A big thank you to ContextMinds, especially Marek Dudáš, for presenting how ContextMinds combines concept mapping and knowledge graphs.

Yours,

DBpedia Association

GSoC2021 – Call for Students
https://www.dbpedia.org/blog/gsoc2021/ (Wed, 17 Mar 2021)

Pinky: Gee, Brain, what are we gonna do this year?
Brain: Wear a mask, keep our distance, and do the same thing we do every year, Pinky. Taking over GSoC2021.

For the 10th year in a row, we have been accepted to be part of this incredible program to support young ambitious developers who want to work with open-source organizations like DBpedia.

So far, each year has brought us new project ideas, many amazing students and great project results that shaped the future of DBpedia. Even though Covid-19 changed a lot in the world, it couldn’t shake Google Summer of Code (GSoC) much. The program, designed to mentor youngsters from afar, is almost too perfect for us. One of the advantages of GSoC, especially in times like these, is the chance to work on projects remotely while still getting a first deep dive into open-source projects like ours.

DBpedia is now looking for students who want to work with us during the upcoming summer months.  

What is Google Summer of Code?

Google Summer of Code is a global program focused on bringing student developers into open-source software development. Stipends are given to students (BSc, MSc, PhD) to work for two and a half months on a specific task. For GSoC newbies, this short video and the information provided on the GSoC website will explain all there is to know about GSoC2021.

And this is how it works …

Step 1: Check out one of our projects here or draft your own.
Step 2: Get in touch with our mentors as soon as possible and write up a project proposal of at least 8 pages. Information about our proposal structure and a template are available here.
Step 3: After a selection phase, students are matched with a specific project and mentor(s) and start working on the project.

Application Procedure

Further information on the application procedure is available in our DBpedia Guidelines. There you will find information on how to contact us and how to apply properly for GSoC2021. Please also note the official GSoC 2021 timeline for your proposal submission and make sure to submit on time. Unfortunately, extensions cannot be granted. The final submission deadline is April 13, 2021, at 8 pm CEST.

Contact

Detailed information on how to apply is available on the DBpedia website. We’ve prepared an information kit for you. Please find all necessary information regarding the student application procedure here.

And in case you still have questions, please do not hesitate to contact us via dbpedia@infai.org.

Stay safe and check Twitter or LinkedIn. Furthermore, you can subscribe to our Newsletter for the latest news and information around DBpedia.

Finally, we are looking forward to your contribution!

Yours DBpedia Association

Why Data Centricity Is Key To Digital Transformation
https://www.dbpedia.org/blog/why-data-centricity-is-key-to-digital-transformation/ (Wed, 17 Feb 2021)

DBpedia Member Features – Last year we gave DBpedia members the chance to present special products, tools and applications on the DBpedia blog. We have already published several posts in which our members provided unique insights. This week we will continue with eccenca. They will explain why companies struggle with digital transformation and why data centricity is the key to this transformation. Have fun while reading!

by Hans-Christian Brockmann, CEO eccenca

Only a few large enterprises like Google, Amazon, and Uber have made the mindset and capability transition to turn data and knowledge into a strategic advantage. They have one thing in common: their roadmap is built on data-centric principles (and yes, they use knowledge graph technology)!


Over the last few years it has become obvious that the majority of companies fail at digital transformation as long as they continue to follow outdated IT management best practices. We, the knowledge graph community, have long been reacting to this with rather technical explanations about RDF and ontologies. While our arguments have been right all along, they did not really address the elephant in the room: it’s not only a technological issue but a question of mindset.


Commonly, IT management is stuck with application-centric principles. Solutions for a particular problem (e.g. financial transactions, data governance, GDPR compliance, customer relationship management) are conceived as singular applications. This has created a plethora of stand-alone applications in companies which store and process interrelated or even identical data but are unable to integrate, because every application has its own schema and data semantics. And companies have hundreds or even thousands of different applications at work. Still, when talking about data integration projects or digital transformation, IT management starts the argument from an application point of view.

Companies Struggle With Digital Transformation Because Of Application Centricity

This application-centric mindset has created an IT quagmire. It prevents automation and digital transformation because of three main shortcomings.

  1. Data IDs are local. The identification of data is restricted to its source application which prevents global identification, access and (re)use.
  2. Data semantics are local. The meaning of data, information about their constraints, rules and context are hidden either in the software code or in the user’s head. This makes it difficult to work cooperatively with data and also hinders automation of data-driven processes.
  3. The knowledge about data’s logic is IT turf. Business users who actually need this knowledge to scale their operations and develop their business in an agile way are always dependent on an overworked IT department, which knows the technicalities but doesn’t understand the business context and needs. Thus, scalability and agility are prevented from the start.

Data centricity changes this perspective because it puts data before applications. Moreover, it simplifies data management. The term was coined by author and IT veteran Dave McComb. The aim of data centricity is to “base all application functionality on a single, simple, extensible and federateable data model”, as Dave recently outlined in the latest Escape From Data Darkness webcast episodes. At first, this might sound like advocating yet another one of those US$ 1bn data integration / consolidation projects done by a big-name software vendor, the likes of which have failed over and over again. Alas, it’s quite the opposite.

A Central Data Hub For Knowledge Driven Automation

Data centricity does not strive to exchange the existing IT infrastructure with just another proprietary application. Data centricity embraces the open-world assumption and agility concepts and thus natively plays well with the rest of the data universe. The application-centric mindset always struggles with questions of integration, consolidation and a religious commitment to being the “single source of truth”. The data-centric mindset does not have to, because integration is (no pun intended) an integral part of the system. Or as Dave puts it in his book “The Data-Centric Revolution”: “In the Data-Centric approach […] integration is far simpler within a domain and across domains [because] it is not reliant on mastering a complex schema. […] In the Data-Centric approach, all identifiers (all keys) are globally unique. Because of this, the system integrates information for you. Everything relating to an entity is already connected to that entity” without having to even consolidate it in a central silo.
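
To make the global-identifier point concrete, here is a toy sketch (ours, not from eccenca or “The Data-Centric Revolution”) using the Python rdflib package: two independently maintained datasets describe the same entity under one shared IRI, so combining them is a plain graph union, with no key-mapping layer. The IRIs, properties and values are made up for the example.

    # Toy sketch of data-centric integration via globally unique identifiers.
    # All names below are invented for illustration.
    from rdflib import Graph, Literal, Namespace, URIRef

    EX = Namespace("https://example.org/ns#")
    acme = URIRef("https://example.org/id/AcmeCorp")  # shared global identifier

    crm = Graph()
    crm.add((acme, EX.accountManager, Literal("J. Doe")))

    billing = Graph()
    billing.add((acme, EX.openInvoices, Literal(3)))

    merged = crm + billing  # rdflib implements graph union via "+"
    for s, p, o in merged:
        print(s, p, o)       # everything about Acme is already connected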


Of course, this sounds exactly like what we have been talking about all those years with knowledge graph technology and FAIR data. And we have seen it working beautifully with customers like Nokia, Siemens, Daimler and Bosch. eccenca Corporate Memory has provided them with a central data hub for their enterprise information that digitalizes expert knowledge, connects disparate data and makes it accessible to both machines and humans. Still, what we have learned from those projects is this: conviction comes before technology, just as data comes before the application. Knowledge graph technology certainly is the key maker to digital transformation. But a data-centric mindset is key.

A big thank you to eccenca, especially Hans-Christian Brockmann, for explaining why data centricity is the key to digital transformation. Four years ago eccenca became a member of the DBpedia Association and helped grow the DBpedia network. Thanks for your contribution and constant support! Feel free to check out eccenca’s member presentation page: https://www.dbpedia.org/dbpedia-members/eccenca/

Yours,

DBpedia Association

FinScience: leveraging DBpedia tools for fintech applications
https://www.dbpedia.org/blog/finscience-leveraging-dbpedia-tools-for-fintech-applications/ (Wed, 16 Dec 2020)

DBpedia Member Features – In the last few weeks, we gave DBpedia members the chance to present special products, tools and applications and share them with the community. We already published several posts in which DBpedia members provided unique insights. This week we will continue with FinScience. They will present their latest products, solutions and challenges. Have fun while reading!

by FinScience

A brief presentation of who we are

FinScience is an Italian data-driven fintech company founded in 2017 in Milan by Google’s former senior managers and Alternative Data experts, who have combined their digital and financial expertise. FinScience thus originates from this merger of the world of Finance and the world of Data Science. The company leverages the founders’ experience concerning Data Governance, Data Modeling and Data Platform solutions. These are further enriched through the company’s tech role in the European consortium SSIX (Horizon 2020 programme), which focused on building a Social Sentiment Index for financial purposes. FinScience applies proprietary AI-based technologies to combine financial data/insights with alternative data in order to generate new investment ideas, ESG scores and non-conventional lists of companies that can be included in investment products by financial operators.

FinScience’s data analysis pipeline is strongly grounded in the DBpedia ontology: in our experience, the greatest value lies in the possibility to connect knowledge in different languages, to query automatically extracted structured information and to have rather frequently updated models.

Products and solutions

FinScience retrieves content from the web daily. About 1.5 million web pages are visited every day across about 35,000 different domains. The content of these pages is extracted, interpreted and analysed via Natural Language Processing techniques to identify valuable information and sources. Thanks to the structured information based on the DBpedia ontology, we can apply our proprietary AI algorithms to suggest the right investment opportunities to our customers.

Our products are mainly based on the integration of this purely digital data (we call it “alternative data”) with traditional sources coming from the world of finance and sustainability. We describe these products briefly:

  • FinScience Platform for traders: it leverages the power of machine learning to help traders monitor specific companies, spot new trends in the financial market, and access a high added-value selection of companies and themes.
  • ESG scoring: we provide an assessment of corporate ESG performance by combining internal data (traditional, self-disclosed data) with external “alternative” data (stakeholder-generated data) in order to measure the gap between what companies communicate and how stakeholders perceive their sustainability commitments.
  • Thematic selections of listed companies: we create trend-driven selections oriented towards innovative themes: our data, together with the analysis of financial specialists, contribute to the selection of sets of listed companies related to trending themes such as the Green New Deal, 5G technology or new medtech applications.

FinScience and DBpedia

As mentioned before, FinScience is strongly grounded in the DBpedia ontology, since we employ Spotlight to perform Named Entity Recognition (NER), namely automatic annotation of entities in a text. The NER task is performed with a two-step procedure. The first step consists in annotating the named entities of a text using DBpedia Spotlight. In particular, Spotlight links a mention in the text (identified by its name and its context within the text) to the DBpedia entity that maximizes the joint probability of occurrence of both. The model is pre-trained on texts extracted from Wikipedia. Note that each entity is represented by a link to a DBpedia page (see, e.g., http://dbpedia.org/page/Eni), a DBpedia type indicating the type of the entity according to the DBpedia ontology, and other information.

Another interesting feature of this approach is that we have a one-to-one mapping between the Italian and English entities (and in general any language supported by DBpedia), allowing us to have a unified representation of an entity across the two languages. We are able to obtain this kind of information by exploiting the potential of DBpedia Virtuoso, which allows us to access the DBpedia dataset via SPARQL. By identifying the entities mentioned in online content, we can understand which topics are mentioned and thus identify companies and trends that are spreading in the digital ecosystem, as well as analyse how they are related to each other.
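
A condensed sketch of this two-step flow, using the public demo services rather than FinScience’s internal deployment: annotate a sentence with DBpedia Spotlight, then look up the Italian counterpart of one entity via owl:sameAs on the DBpedia SPARQL endpoint.

    # Sketch: Spotlight annotation, then cross-language entity mapping.
    import requests

    # Step 1: annotate text with the public DBpedia Spotlight API.
    ann = requests.get(
        "https://api.dbpedia-spotlight.org/en/annotate",
        params={"text": "Eni announced new investments in renewable energy.",
                "confidence": 0.5},
        headers={"Accept": "application/json"},
        timeout=30,
    ).json()
    for res in ann.get("Resources", []):
        print(res["@surfaceForm"], "->", res["@URI"])

    # Step 2: map the English entity to its Italian DBpedia counterpart.
    q = """
    SELECT ?it WHERE {
      <http://dbpedia.org/resource/Eni> owl:sameAs ?it .
      FILTER (STRSTARTS(STR(?it), "http://it.dbpedia.org/"))
    }
    """
    rows = requests.get(
        "https://dbpedia.org/sparql",
        params={"query": q, "format": "application/sparql-results+json"},
        timeout=30,
    ).json()["results"]["bindings"]
    print([r["it"]["value"] for r in rows])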

Challenges and next steps

One of the toughest challenges for us is finding an optimal way to update the models used by DBpedia Spotlight. Every day new entities and concepts arise, and we want to recognise them in the news we analyse. And that is not all. In addition to recognizing new concepts, we need to be able to track an entity through all the updated versions of the model. In this way, we will not only be able to identify entities, but we will also have evidence of when some concepts were first generated. And we will know how they have changed over time, regardless of the names that have been used to identify them.

We are strongly involved in the DBpedia community and we try to contribute with our know-how. In particular, FinScience will contribute to infrastructure and Dockerfiles, as well as to finding issues in newly released projects (for instance, wikistats-extractor).

A big thank you to FinScience for presenting their products, challenges and contribution to DBpedia.

Yours,

DBpedia Association

PoolParty Semantic Suite: The Ideal Tool To Build And Manage Enterprise Knowledge Graphs
https://www.dbpedia.org/blog/poolparty-semantic-suite/ (Fri, 04 Dec 2020)

DBpedia Member Features – In the coming weeks, we will give DBpedia members the chance to present special products, tools and applications and share them with the community. We will publish several posts in which DBpedia members provide unique insights. This week the Semantic Web Company will present use cases for the PoolParty Semantic Suite. Have fun while reading!

by the Semantic Web Company

About 80 to 90 percent of the information companies generate is extremely diverse and unstructured, stored in text files, e-mails or similar documents, which makes it difficult to search and analyze. Knowledge graphs have become a well-known solution to this problem because they make it possible to extract information from text and link it to other data sources, whether structured or not. However, building a knowledge graph at enterprise scale can be challenging and time-consuming.

PoolParty Semantic Suite is the most complete and secure semantic platform on the global market. It is also the ideal tool to help companies build and manage Enterprise Knowledge Graphs. With PoolParty in place, you will have no problems extracting value from large amounts of heterogeneous data, no matter if it’s stored in a relational database or in text files. The platform provides comprehensive tools for the management of enterprise knowledge graphs along the entire life cycle. Here is a list of the main use cases for the PoolParty Semantic Suite:

Data linking and enrichment

Driven by the Linked Data initiative, increasing amounts of viable data sets about various topics have been published on the Semantic Web. PoolParty allows users to use these online resources, amongst them DBpedia, to easily and quickly enrich a thesaurus with more data.
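
PoolParty does this through its UI, but for a flavour of the underlying Linked Data lookup, here is a small sketch (not PoolParty code) that pulls an English label and abstract for one concept from the public DBpedia SPARQL endpoint, as one might to enrich a thesaurus entry.

    # Not PoolParty code: fetch an English label and abstract for one concept
    # from the public DBpedia SPARQL endpoint.
    import requests

    q = """
    SELECT ?label ?abstract WHERE {
      dbr:Knowledge_graph rdfs:label ?label ;
                          dbo:abstract ?abstract .
      FILTER (lang(?label) = "en" && lang(?abstract) = "en")
    }
    """
    rows = requests.get(
        "https://dbpedia.org/sparql",
        params={"query": q, "format": "application/sparql-results+json"},
        timeout=30,
    ).json()["results"]["bindings"]
    if rows:
        print(rows[0]["label"]["value"])
        print(rows[0]["abstract"]["value"][:200], "...")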

Search and recommender engines

Arrive at enriched and in-depth search results that provide relevant facts and contextualized answers to your specific questions, rather than a broad search result with many (ir)relevant documents and messages – but no valuable input. PoolParty Semantic Suite can be used to implement semantic search and recommendations that are relevant to your users.

Text Mining and Auto Tagging

Manually tagging an entire database is very time-consuming and often leads to inconsistent search results. PoolParty’s graph-based text mining can improve this process, making it faster, more consistent and more precise. This is achieved by using advanced text-mining algorithms and Natural Language Processing to automatically extract relevant entities, terms and other metadata from text and documents, helping drive in-depth text analytics.

Data Integration and Data Fabric

The Semantic Data Fabric is a new solution to data silos that combines the best-of-breed technologies, data catalogs and knowledge graphs, based on Semantic AI. With a semantic data fabric, companies can combine text and documents (unstructured) with data residing in relational databases and data warehouses (structured) to create a comprehensive view of their customers, employees, products, and other vital areas of business.

Taxonomies, Ontologies and Knowledge Graphs That Scale

With release 8.0 of the PoolParty Semantic Suite, users have even more options to conveniently generate, edit, and use knowledge graphs. In addition, the powerful and performant GraphDB by Ontotext has been added as PoolParty’s recommended embedded store; it is shipped as an add-on module. GraphDB is an enterprise-level graph database with state-of-the-art performance, scalability and security. This gives PoolParty greater robustness and allows you to work with much larger taxonomies effectively.

A big thank you to the Semantic Web Company for presenting use cases for the PoolParty Semantic Suite.

Yours,

DBpedia Association


TerminusDB and DBpedia
https://www.dbpedia.org/blog/terminusdb-and-dbpedia/ (Fri, 27 Nov 2020)

DBpedia Member Features – In the coming weeks, we will give DBpedia members the chance to present special products, tools and applications and share them with the community. We will publish several posts in which DBpedia members provide unique insights. This week TerminusDB will show you how to use TerminusDB’s unique collaborative features to access DBpedia data. Have fun while reading!

by Luke Feeney from TerminusDB

This post introduces TerminusDB as a member of the DBpedia Association – proudly supporting the important work of DBpedia. It will also show you how to use TerminusDB’s unique collaborative features to access DBpedia data.

TerminusDB – an Open Source Knowledge Graph

TerminusDB is an open-source knowledge graph database that provides reliable, private & efficient revision control & collaboration. If you want to collaborate with colleagues or build data-intensive applications, nothing will make you more productive.

TerminusDB provides the full suite of revision control features and TerminusHub allows users to manage access to databases and collaboratively work on shared resources.

  • Flexible data storage, sharing, and versioning capabilities
  • Collaboration for your team or integrated in your app
  • Work locally then sync when you push your changes
  • Easy querying, cleaning, and visualization
  • Integrate powerful version control and collaboration for your enterprise and individual customers.

The TerminusDB project originated in Trinity College Dublin in Ireland in 2015. From its earliest origins, TerminusDB worked with DBpedia through the ALIGNED project, which was a research project funded by Horizon 2020 that focused on building quality-centric software for data management.

ALIGNED Project with early TerminusDB (then called ‘Dacura’) and DBpedia


While working on this project, and especially our work building the architecture behind Seshat: The Global History Databank, we needed a solution that could enable collaboration among a highly distributed team on a shared database whose primary function was the curation of high-quality datasets with a very rich structure. While the scale of the data was not particularly large, the complexity was extremely high. Unfortunately, the linked-data and RDF toolchains were severely lacking: we evaluated several tools in an attempt to architect a solution; however, in the end we were forced to build an end-to-end solution ourselves.

Evolution of TerminusDB

In general, we think that computers are fantastic things because they allow you to leverage much more evidence when making decisions than would otherwise be possible. It is possible to write computer programs that automate the ingestion and analysis of unimaginably large quantities of data.

If the data is well chosen, it is almost always the case that computational analysis reveals new and surprising insights simply because it incorporates more evidence than could possibly be captured by a human brain. And because the universe is chaotic and there are combinatorial explosions of possibilities all over the place, evidence is always better than intuition when seeking insight.

As anybody who has grappled with computers and large quantities of data will know, it’s not as simple as that. Computers should be able to do most of this for us. It makes no sense that we are still writing the same simple and tedious data validation and transformation programs over and over ad infinitum. There must be a better way.

This is the problem that we set out to solve with TerminusDB. We identified two indispensable characteristics that were lacking in data management tools:

  1. A rich and universally machine-interpretable modelling language. If we want computers to be able to transform data between different representations automatically, they need to be able to describe their data models to one another.
  2. Effective revision control. Revision control technologies have been instrumental in turning software production from a craft to an engineering discipline because they make collaboration and coordination between large groups much more fault tolerant. The need for such capabilities is obvious when dealing with data – where the existence of multiple versions of the same underlying dataset is almost ubiquitous and with only the most primitive tool support.

TerminusDB and DBpedia

Team TerminusDB took part in the DBpedia Autumn Hackathon 2020. As you know, DBpedia is an extract of the structured data from Wikipedia.

Our Hackathon Project Board

You can read all about our DBpedia Autumn Hackathon adventures in this blog post.

Open Source

Unlike many systems in the graph database world, TerminusDB is committed to open source. We believe in the principles of open source, open data and open science. We welcome all those data people who want to contribute to the general good of the world. This is very much in alignment with the DBpedia Association and community.

DBpedia on TerminusHub

TerminusHub is the collaboration point between TerminusDB instances. You can push data to your colleagues and collaborators, you can pull updates (efficiently: just the diffs) and you can clone databases that are made available on the Hub (by the TerminusDB team or by others). Think of it as GitHub, but for data.

The DBpedia database is available on TerminusHub. You can clone the full DB in a couple of minutes (depending on your internet connection, of course) and get querying. TerminusDB uses succinct data structures to compress everything, which makes sharing large databases feasible; more technical detail for interested parties is here: https://github.com/terminusdb/terminusdb/blob/dev/docs/whitepaper/terminusdb.pdf

TerminusDB in the DBpedia Association

We will contribute to DBpedia by working to improve the quality of data available, by introducing new datasets that can be integrated with DBpedia, and by participating fully in the community.

We are looking forward to a bright future together.

A big thank you to Luke and TerminusDB for presenting how TerminusDB works and how they plan to work with DBpedia in the future.

Yours,

DBpedia Association
