We do full-stack Data Science, Research, and Engineering. That means we work alongside you through every step: (1) conceptualizing your problem, (2) specifying, storing, and managing the dataset(s) that we need, (3) coding exploratory analytics and Machine Learning procedures to process your data, (4) learning as much as we can from the results, and finally (5) reporting and/or deploying in a clear and concise manner.

Machine Learning

We employ a range of Machine Learning approaches to develop AI solutions that support your business or organization. We use both supervised and unsupervised learning, depending on the problem you want us to solve. From customer and market segmentation, choice models, NLP, and the extraction of semantic themes from documents to causal networks and social media analysis, statistical learning models are the key players in our work.

Information Retrieval

We take the road from unstructured to structured data for you and share any new insights hidden in the source data along the way: NLP and Text-Mining, Word-Sense Disambiguation, Entity-Linking, Semantic Topic Modeling, and more. We are Semantic Web specialists and can bring the most powerful knowledge structures supported by AI/ML to you. We discover things not present prima facie in the documents or websites and shed light on latent connections between various sources.

Figure 1. Graphing the similarity structure between document paragraphs with Rgraphviz.
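
As a rough illustration of the idea behind a paragraph similarity graph, here is a minimal Python sketch. It is not the pipeline used to produce the figure: it assumes simple bag-of-words vectors and cosine similarity, and the function names and the 0.2 threshold are hypothetical choices made for this example. The resulting edge list could then be passed to any graph-drawing tool.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_edges(paragraphs, threshold=0.2):
    """Return (i, j, similarity) for each paragraph pair at or above the threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    vectors = [Counter(tokenize(p)) for p in paragraphs]
    edges = []
    for i, j in combinations(range(len(vectors)), 2):
        s = cosine(vectors[i], vectors[j])
        if s >= threshold:
            edges.append((i, j, s))
    return edges


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat sat on the rug",
        "quantum fields interact",
    ]
    for i, j, s in similarity_edges(docs):
        print(f"paragraph {i} -- paragraph {j}: similarity {s:.3f}")
```

In practice, raw word counts would usually be replaced by TF-IDF weights or embeddings, but the graph construction step stays the same: paragraphs become nodes, and sufficiently similar pairs become edges.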

Predictive Analytics

We analyze and then predict the choices made by your customers, your users, or even your machines! Our toolbox includes Generalized Linear Models, models of Choice Under Risk and Uncertainty, Decision Trees and Random Forests, and more. The focus is on building explainable choice models that not only predict choices but also uncover their inner logic, putting you in control of the process. After all, what use is knowledge that one cannot act upon? Highlight: predicting query processing time from an NLP-like analysis of SPARQL queries with 91% accuracy.
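
To make the notion of an explainable choice model concrete, here is a minimal Python sketch of a multinomial logit, one common member of the Generalized Linear Model family. It is an illustration under stated assumptions, not a model we have fitted: the attribute names, weights, and options below are hypothetical. Each option gets a linear utility, and a softmax turns the utilities into choice probabilities, so every weight can be read directly as the influence of one attribute.

```python
import math


def linear_utility(features, weights):
    """Utility of one option: weighted sum of its attributes."""
    return sum(weights[name] * value for name, value in features.items())


def choice_probabilities(utilities):
    """Multinomial logit: softmax over option utilities.

    Subtracting the maximum utility keeps exp() numerically stable.
    """
    max_u = max(utilities.values())
    exp_u = {option: math.exp(u - max_u) for option, u in utilities.items()}
    total = sum(exp_u.values())
    return {option: value / total for option, value in exp_u.items()}


if __name__ == "__main__":
    # Hypothetical weights: price lowers utility, quality raises it.
    weights = {"price": -0.5, "quality": 1.0}
    # Hypothetical options described by the same attributes.
    options = {
        "basic": {"price": 2.0, "quality": 1.0},
        "premium": {"price": 4.0, "quality": 3.0},
    }
    utilities = {name: linear_utility(f, weights) for name, f in options.items()}
    for name, p in choice_probabilities(utilities).items():
        print(f"{name}: {p:.3f}")
```

Because the utility is linear, the sign and size of each weight state how an attribute moves the predicted choice, which is what makes this kind of model actionable.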

Figure 2. Utility-dependent and Bayesian inference-driven probability weighting in human choice under risk.

Social Media Analysis

Network analysis applied to social media: you specify the platform and we will deliver a comprehensive report on user behavior in your campaign, on your fan page, on Twitter, and elsewhere. Powerful interactive visualizations and algorithms are used to reduce the complexity of the source data and help us understand and improve the effort made.

Figure 3. A chord diagram. Just a chord diagram.

Experimental Behavioral Analysis

With more than twenty years of experience in designing and analyzing behavioral experiments, we can develop full experimental designs to test any relevant hypothesis on the behavior of your user or customer base. We also develop custom behavioral models of experimental data where we align the concepts used in the model with the concepts that you naturally use to think about your customers or users.

Visualizations, Dashboards & Reports

We use the RStudio Shiny technology stack for front-end development in R, as well as R Markdown, flexdashboard, D3 interactive visualizations, and more to provide fully interactive reporting for your business or organization. Putting all of this into production and deploying it on your infrastructure or ours is also our job.

Figure 4. Studying and visualizing open data, presented in Startit, Belgrade, 2017.

Databases & Big Data

We provide Data Science services for Wikidata, the largest and most complex open knowledge base in the world. And of course, we can manage your data too. A range of solutions on virtualized and standard infrastructures is available, depending on your needs and on what exactly needs to be done with the data that is stored and managed. We are highly skilled in making R and Python interact with Big Data systems such as Hadoop or Apache Spark, and in writing high-performance R for processing large data sets.

Figure 5. The Wikidata Identifiers Landscape, developed for Wikimedia Deutschland.
Links between thousands of Wikidata’s external IDs are visualized.