Java and AI to Boost Your Business 

(And How You Can Get Started with Your Own Customized Recommendation Engine)

One of the hottest tech topics today is artificial intelligence, or AI. Estimates suggest that 77% of the global population uses it in some form, and a recent IDC report projects that global spending on AI will double over the next four years to $110B annually. AI is already embedded in our daily lives: Siri, Alexa, search engine algorithms, and various chatbots all rely on it to serve us.

One hallmark of the numerous digital transformations ushered in by AI is the recommendation engine. The technology was pioneered by the tech titans Amazon and Netflix. Netflix famously offered a $1M prize to anyone who could improve the accuracy of its recommendation engine by 10%.


The ability to anticipate a customer’s preferences and offer relevant recommendations has become a primary differentiator between digital companies and their legacy counterparts, springboarding Amazon and Netflix to the cutting edge of their respective industries.


Take Norrøna, a leading brand of outdoor clothing in Scandinavia, which transitioned from wholesale to its first brick-and-mortar store and an ecommerce site replete with recommendation engines in 2009. Over the next 10 years Norrøna grew to more than 22 physical stores and now generates approximately half of its revenue from direct-to-consumer sales.


In Part 1, the preview article for this series, we discussed the journey of building a Java-based recommendation engine for a client’s ecommerce needs. Because we had no prior experience with recommendation engines, the learning curve was steep: we had to survey the supporting technologies, decide whether to build an engine from scratch or customize an existing one, and work out how we would eventually train it with data. Along the way we became well versed in building machine learning algorithms.

Our initial efforts at building a recommendation engine used content-based filtering, a process that matched user interests with various products based on their previous searches or feedback.


Typically, the algorithms would look for products that matched the previous interests stored in the user profile and promote them to the user. However, one drawback of content-based recommendation engines is that they lose efficacy for more complex user profiles or when clear and differentiated descriptions are required. To tackle these drawbacks, we turned to more advanced machine learning algorithms that analyze user interactions more comprehensively.
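To make content-based filtering concrete, here is a minimal Java sketch. It is an illustration only, not the engine described in this series: the class name, the tag-overlap scoring, and the sample catalog are all assumptions chosen for brevity. It ranks products by how many of their tags also appear in the interests stored for a user.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of content-based filtering by tag overlap.
class ContentBasedFilterDemo {

    // Number of product tags that also appear in the user's stored interests.
    static long overlap(Set<String> interests, Set<String> productTags) {
        return productTags.stream().filter(interests::contains).count();
    }

    // Products with at least one matching tag, best match first (ties by name).
    static List<String> recommend(Set<String> interests, Map<String, Set<String>> catalog) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> product : catalog.entrySet()) {
            if (overlap(interests, product.getValue()) > 0) {
                matches.add(product.getKey());
            }
        }
        matches.sort((a, b) -> {
            int byScore = Long.compare(overlap(interests, catalog.get(b)),
                                       overlap(interests, catalog.get(a)));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return matches;
    }

    public static void main(String[] args) {
        // Sample data is made up for the example (Set.of/Map.of need Java 9+).
        Set<String> interests = Set.of("jacket", "waterproof", "hiking");
        Map<String, Set<String>> catalog = Map.of(
                "Storm Shell Jacket", Set.of("jacket", "waterproof"),
                "Trail Boots", Set.of("hiking", "boots"),
                "Yoga Mat", Set.of("yoga"));
        System.out.println(recommend(interests, catalog));
    }
}
```

A scheme like this only works when products carry clear, distinguishing tags, which is exactly the limitation noted above.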


In this second article, we showcase the next steps in developing complex recommendation engines and how we integrated AI into our own Java recommendation engine through sophisticated data filtering and enhanced matching capability.

We help our clients create impact across many industries. See how on our case studies page!

Setting the Context for Smart Recommendations 

First, it’s important to understand why we need an enhanced recommendation engine and to briefly differentiate between AI, machine learning, and deep learning.  

The premise of AI is to model human intelligence that can then mimic our actions and emotions. Humanoid robots, for instance, exhibit numerous attributes of artificial intelligence in their ability to move and replicate the human body.


A subset of AI, machine learning focuses on training algorithms with large datasets so that software learns from its own “experiences,” much like a young child. Deep learning, in turn, is a more complex form of machine learning, based on even more advanced techniques using neural networks that mimic our own brain.


As Maruti Techlabs states: “People generally like to be recommended things which they would like, and when they use a site which can relate to his/her choices extremely perfectly then he/she is bound to visit that site again.” By replicating human intelligence, recommendation engines provide users with a personalized interface that caters to their interests, ultimately boosting your sales and customer retention.

One of the foremost questions we had to ask ourselves at the outset of this project was whether we should build a recommendation engine from scratch for our client. We found a few major reasons why it would be advantageous:

  • Improve retention
    Effective recommendation engines streamline the user experience, encouraging users to keep using our client’s products.
  • Increase purchase frequency
    Recommending products that align with a user’s interests encourages customers to purchase more products each visit.
  • Re-purchase items
    A well-built recommendation engine reminds customers to repurchase a product they bought in the past, boosting sales.
  • Promote a wider variety of products
    Finally, a good recommendation engine draws on the full scope of our client’s offerings, engaging users with a wider variety of products.

Supervised vs. Unsupervised Machine Learning 

In adopting an AI approach to building a recommendation engine, we had to choose between the two primary types of machine learning: supervised and unsupervised.

Supervised learning feeds the algorithm input data together with the expected output, in the anticipation that it will learn to detect the underlying patterns in the labeled data. Determining which category a news article belongs to is a typical example.

Unsupervised learning, on the other hand, receives unlabeled data and is left to find patterns on its own, without human supervision. After weighing both options, we found that unsupervised learning was the best choice for our client, and it served as a great learning opportunity.


Facebook is a pioneer in machine learning and recommendation engines. The company is well known for its use of machine learning to collect data and recommend content to its billions of users. It uses hundreds of parameters and key performance indicators to anticipate and collect data on user interactions, feeding everything from scrolling, comments, likes, shares, and posts to interactions with external apps, polls, and advertisements into its neural networks.


Complex machine learning algorithms are on the bleeding edge of ecommerce and social media technology and can serve as a key differentiator from more antiquated companies.  

So, the question becomes this: What methodology would we use to provide users recommendations based on their preferences?


There were two types of data we could use: implicit and explicit feedback. Implicit feedback is the data users provide, from the moment they register, simply by interacting with the site or application: the information they read, the duration of their visit, the products they interact with, the content they view, the links they visit, and other data that is fed into predictive algorithms.


Explicit feedback is any information the user provides in response to a specific request or question. However, before we could use more complex recommendation techniques, we needed a baseline to work with. Without any previous knowledge of the user, how could we train a system to learn their unique preferences? This is one of the challenges we faced in determining how to train our recommendation engine.

At Integrant, our Java engineers are ready and eager to take on your project!

Our Decision: Use Cosine Similarity

Fortunately, we landed upon an ingenious solution called cosine similarity. Cosine similarity is a technique used in machine learning text analysis that measures the similarity of two documents based on their word frequencies rather than on the length of the text. Selva Prabhakaran explains it as follows:


  • Mathematically, it measures the cosine of the angle between two vectors projected in a multi-dimensional space. The cosine similarity is advantageous because even if the two similar documents are far apart by the Euclidean distance (due to the size of the document), chances are they may still be oriented closer together. The smaller the angle, higher the cosine similarity.


To provide a brief overview, in cosine similarity each document is represented as a vector in a multi-dimensional space, with one dimension per word. Each component of the vector is the number of times that word appears in the document. The magnitude of the vector therefore grows with the length of the document, while its direction reflects the relative frequency of each word, which is independent of the actual length of the text.


The power of cosine similarity is the ability to compare how similar two texts are using the cosine of the angle between their vectors, yielding a meaningful result regardless of the difference in length, or magnitude, between the two documents.
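For two word-count vectors a and b, the cosine of the angle between them is the dot product of a and b divided by the product of their lengths. The Java sketch below illustrates that calculation. It is a simplified example rather than the implementation used in our engine: the class and method names are hypothetical, and the tokenizer simply lower-cases the text and splits on non-word characters.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of cosine similarity over word counts.
class CosineSimilarityDemo {

    // Count how often each lower-cased word occurs in the text.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a word-count vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // cos(theta) = (a . b) / (|a| * |b|); returns 0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            int countInB = b.getOrDefault(entry.getKey(), 0);
            dot += entry.getValue().doubleValue() * countInB;
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0 || normB == 0) ? 0.0 : dot / (normA * normB);
    }

    public static void main(String[] args) {
        String shortDoc = "warm waterproof jacket";
        String longDoc = shortDoc + " " + shortDoc + " " + shortDoc; // same words, 3x longer
        String other = "lightweight running shoes";

        // Same word proportions, different lengths: similarity is approximately 1.
        System.out.println(cosine(termFrequencies(shortDoc), termFrequencies(longDoc)));
        // No shared words: similarity is 0.
        System.out.println(cosine(termFrequencies(shortDoc), termFrequencies(other)));
    }
}
```

The first comparison shows the length independence described above: tripling a document changes the magnitude of its vector but not its direction, so the angle between the two vectors stays at zero.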

The magic behind the cosine algorithm is that it scores each term by how often it appears and compares documents on those frequencies. In our test case we used a database of 60,000 unique English words and assigned each one a vector entry corresponding to the number of times it appears. Adopting this methodology, we were able to derive the similarity of the content of various documents using the cosine of the angle between their respective vectors. The smaller the angle, the greater the similarity.


For the database administrator using our recommendation engine, we offer the ability to weight user interactions according to their impact on the recommendations. For instance, you can decide that when a user “saves content for later,” the action carries a certain weight in the predictive algorithm. This interaction then becomes part of a set of predicted actions the user is expected to perform, and recommendations are provided based on that data.

To test the recommendation engine, we are currently experimenting with a movie database. For example, if a user likes watching dramas, then each time they add a movie of this genre to the watchlist it is assigned a certain weight by our recommendation engine. Then, if the user happens to add a comedy movie to the watchlist, we don’t automatically assume they have switched interests.


Furthermore, we trained the recommendation engine to assign certain numbers for weighted actions: “watching a movie” may be equivalent to a 20 while “adding a movie to watchlist” might be assigned a 5, thus making the algorithm more precise. By incorporating large datasets and measuring these aggregated and weighted interactions, the cosine similarity algorithm gave us new and exciting insights on creating effective recommendation engines.  
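The following Java sketch shows one way weighted interactions could feed into cosine similarity. It is a hypothetical illustration, not our production code: the per-genre profile, the one-genre movie vectors, and all names are assumptions, and only the weights 20 and 5 come from the example above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: weighted user actions build a per-genre profile.
class WeightedProfileDemo {

    enum Action { WATCHED, ADDED_TO_WATCHLIST }

    // Example weights from the text: watching counts 20, watchlisting counts 5.
    static double weightOf(Action action) {
        return action == Action.WATCHED ? 20.0 : 5.0;
    }

    // Add the weight of one interaction to the genre it touched.
    static void record(Map<String, Double> profile, String genre, Action action) {
        profile.merge(genre, weightOf(action), Double::sum);
    }

    // Cosine similarity between two sparse genre vectors.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double sumSquaresA = 0.0;
        double sumSquaresB = 0.0;
        for (Map.Entry<String, Double> entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0.0);
            sumSquaresA += entry.getValue() * entry.getValue();
        }
        for (double value : b.values()) {
            sumSquaresB += value * value;
        }
        if (sumSquaresA == 0 || sumSquaresB == 0) {
            return 0.0;
        }
        return dot / Math.sqrt(sumSquaresA * sumSquaresB);
    }

    public static void main(String[] args) {
        Map<String, Double> profile = new HashMap<>();
        record(profile, "drama", Action.WATCHED);
        record(profile, "drama", Action.WATCHED);
        record(profile, "drama", Action.ADDED_TO_WATCHLIST);
        record(profile, "comedy", Action.ADDED_TO_WATCHLIST); // a single outlier

        // Each movie is reduced to a single genre for simplicity (Map.of needs Java 9+).
        Map<String, Double> dramaMovie = Map.of("drama", 1.0);
        Map<String, Double> comedyMovie = Map.of("comedy", 1.0);

        System.out.println("drama:  " + cosine(profile, dramaMovie));
        System.out.println("comedy: " + cosine(profile, comedyMovie));
    }
}
```

In this sketch the profile ends up with a weight of 45 for drama and 5 for comedy, so a drama still scores far higher than a comedy. One watchlisted comedy nudges the profile without flipping the user's apparent interests.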

Interested in developing a recommendation engine for your business? Let us help you. Schedule your free technical consultation today!

Next Steps 

When we started on this journey of building a recommendation engine, we had no idea of the challenges we were up against. However, through trial and error, we have stretched ourselves and gained new insights that have made us stronger as a team.


In the process, we learned how recommendation engines work, the best ways to filter and test data, the differences between supervised and unsupervised learning, and how to train machine learning algorithms to learn unsupervised.  

Our goal is to continue using the movie database test platform to further develop our skills at making accurate predictions about customer preferences. We’ve become more advanced in our understanding of how to filter and weight datasets through cosine similarity, enabling enhanced matching capabilities for cutting-edge recommendation engines.


Looking forward, we plan to continue strengthening our algorithms with large datasets so that they can predict sophisticated patterns and account for anomalies or variations in user behavior.


Does your business need a recommendation engine to increase customer retention and boost sales? Are you interested in applying AI and machine learning-based problem solving to real business challenges?


If you’re looking for seasoned, enterprise-grade Java developers with experience building and testing machine learning algorithms, we’d be more than happy to share a demo of our work.


Give us a call today to discover how we can partner together on your next project!

© 2021 Integrant, Inc. All Rights Reserved