Tuesday, October 20, 2009

Designs for making your searches faster - Covering the basics - Part 1

Search is a universal requirement for any web application. Be it a simple collaboration portal or a complex trading system, searches are the most frequently used feature of any application.
Searches can generally be categorized into two types at a high level:
  • Simple one box search that searches everything for the content you are looking for
  • Focused searches that help you seek a single entry from a large data repository.
There are many more variations of each type, but they all generally fall into one of these two categories.
The former, the one box search, is mostly provided in content portals and informational web sites that cater to the general public. Such users may not be attuned to the nuances of the latter type, or the site may simply need to keep things simple.
Behind the scenes, these searches are implemented using complex approaches that we will not debate in this article.
In this part we will focus on the latter.

Transaction Searches
Transaction searches are widely used in enterprise web applications that are targeted at a trained user base. This user base knows exactly what it wants and can usually provide enough input data to help the system swiftly narrow down to the single entity it is looking for.
Here is a sample screen shot of a search screen from a complex web based enterprise application

courtesy of http://crowdfavorite.com/tasks-pro/
The easiest route adopted by designers when implementing these types of searches is a direct search on the database followed by retrieval of all results and displaying them in an HTML table with a next or a previous button.
Clicking next or previous repeats the whole process; the only difference is that the displayed content skips the rows already shown on the earlier pages.
While this is the easiest route, it has two major disadvantages:
  1. very high resource consumption
  2. very low performance
Let us look at why this approach is bad.

Very High Resource Consumption
  • Every pagination attempt runs a full search on the database and retrieves all the matching rows.
  • The rows are transported to the application server in part or in full, clogging up network bandwidth.
  • The application code uses only what is required for that page and discards the rest.
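
The three steps above can be sketched as follows. This is only an illustration of the naive approach, not code from any real application: the `txn` table, its columns, the page size and the helper names are all assumptions, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3

PAGE_SIZE = 10  # assumed page size, for illustration only


def build_demo_db(row_count=1000):
    """Create an in-memory table of hypothetical transactions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE txn (id INTEGER PRIMARY KEY, payee TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO txn (id, payee, amount) VALUES (?, ?, ?)",
        [(i, f"payee-{i % 50}", float(i)) for i in range(1, row_count + 1)],
    )
    conn.commit()
    return conn


def naive_page(conn, payee_prefix, page_number):
    """Run the full search, fetch every match, then keep a single page of it."""
    all_rows = conn.execute(
        "SELECT id, payee, amount FROM txn WHERE payee LIKE ? ORDER BY id",
        (payee_prefix + "%",),
    ).fetchall()  # every matching row leaves the database
    start = (page_number - 1) * PAGE_SIZE
    page = all_rows[start:start + PAGE_SIZE]  # only this slice is displayed
    return page, len(all_rows)  # the remaining rows are simply discarded


conn = build_demo_db()
page, fetched = naive_page(conn, "payee-1", page_number=2)
print(f"fetched {fetched} rows to display {len(page)}")
```

Every call to `naive_page` repeats the full query, so the gap between rows fetched and rows displayed is paid again on each click of next or previous.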

Very low performance
  • Hitting the database for every page stresses the application's resources under load, so the application cannot deliver optimal performance.

What can I do? How can I improve my application's performance?
Fortunately, there are many options available. Unfortunately, no one size fits all cases, so you must decide which options best suit your context.
Let us look at some basics that must be covered before we look at the options.

Basics
  1. Know your SLAs
  2. Know your load conditions and load patterns
  3. Know your requirements and most frequently used usecases

Knowing your SLAs

Service level agreements (SLAs) define the success criteria for an application's performance. It is always good practice to have well segregated SLAs for your application, classified as
  • The general performance SLA of all pages in the application
  • Specific SLAs for mission critical features of the application
For instance, in a retail banking website, all general inquiry features could have an SLA of 3 seconds, whereas the "Third Party Fund Transfer" and "Cash Withdrawal Update" features would have an SLA of 2 seconds.
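
As a rough sketch, such segregated SLAs could be kept in a small lookup that falls back to the general SLA when a feature has no specific one. The 3 second and 2 second targets come from the banking example above; the feature keys and the function names `sla_for`, `meets_sla` and `timed` are hypothetical.

```python
import time

DEFAULT_SLA_SECONDS = 3.0  # general inquiry features, as in the example above

# Mission critical features get a tighter target (keys are hypothetical names).
FEATURE_SLA_SECONDS = {
    "third_party_fund_transfer": 2.0,
    "cash_withdrawal_update": 2.0,
}


def sla_for(feature):
    """Return the specific SLA for a feature, or the general one."""
    return FEATURE_SLA_SECONDS.get(feature, DEFAULT_SLA_SECONDS)


def meets_sla(feature, elapsed_seconds):
    """True when the measured response time is within the feature's SLA."""
    return elapsed_seconds <= sla_for(feature)


def timed(feature, func, *args):
    """Run func and report whether it stayed within the feature's SLA."""
    started = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - started
    return result, elapsed, meets_sla(feature, elapsed)
```

A response of 2.5 seconds would pass for a general inquiry page but fail for "Cash Withdrawal Update", which is the point of keeping the two classes of SLA apart.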
It is also very important that all SLAs are defined under specific load conditions. For instance, a table such as the one given below could be used for this purpose:






Feature                      User Load          Performance
All pages                    < …                2 secs
All pages                    > 100 and < …      4 secs
Third Party Fund Transfer    < …                2 secs

Knowing your Load Conditions and Patterns

It is very important for designers to understand the load conditions and patterns of their application. Load conditions help quantify the number of transactions that reside in your database at any point in time. In this context, it is also important to understand the quantum of data that must be processed to provide the search feature.
For instance, we could be dealing with a simple five column table with no more than one thousand rows, or with a huge document metadata repository with hundreds of attributes running to millions of rows.
It is very important to know the data load that one has to deal with when evolving a solution for the search.
Load patterns are as important as load conditions. Some applications face minimal load through most of the month and then two or three times the average load during a three day period in the first week of the month. It is extremely important to factor this in when building a solution, because customers and the application are stretched the most during these periods, and these periods tend to make or break the application's success with the business.
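
One way to put a number on such a pattern is to compare the busiest three day window with the overall daily average. The sketch below assumes you already have daily search counts from your own logs; the figures in `month` are made up purely for illustration, and the function names are hypothetical.

```python
def busiest_window(daily_counts, window_days=3):
    """Return (start_index, average_per_day) of the busiest consecutive window."""
    if not 0 < window_days <= len(daily_counts):
        raise ValueError("window_days must be between 1 and len(daily_counts)")
    best_start, best_total = 0, sum(daily_counts[:window_days])
    running = best_total
    for start in range(1, len(daily_counts) - window_days + 1):
        # Slide the window by one day: add the newest day, drop the oldest.
        running += daily_counts[start + window_days - 1] - daily_counts[start - 1]
        if running > best_total:
            best_start, best_total = start, running
    return best_start, best_total / window_days


def peak_to_average(daily_counts, window_days=3):
    """Ratio of the busiest window's daily average to the overall daily average."""
    overall = sum(daily_counts) / len(daily_counts)
    start, peak = busiest_window(daily_counts, window_days)
    return start, peak / overall


# Hypothetical month: 1,000 searches a day, except three days at 3,000.
month = [1000] * 30
month[2:5] = [3000, 3000, 3000]
start, ratio = peak_to_average(month)
print(f"peak window starts at day index {start}, ratio {ratio:.1f}")
```

A solution sized only for the overall average would be undersized for exactly the days on which the business depends on it most.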
The usage pattern is an important item that the business analyst must cover when developing usecases.

Knowing your Requirements and most frequently used Usecases

Knowing the requirements of the search is a no-brainer, but what most people fail to do is ask:
  • How will an end user actually use the search feature?
  • What will be the most frequently used fields?
  • Which criteria is the user most likely to fill in full, and which in part?
  • Should some criteria be mandated?
  • Should the criteria be designed so that the user is helped to provide accurate values?
The above points can be discussed in much detail, but I will focus on one.
Helping the user provide search criteria that are as complete as possible is one of the most critical steps and goes a long way in improving overall performance.
For instance, consider a bank customer who wants to transfer funds to a customer in another branch. This feature requires at least a customer name and a branch name to list the options. In this case, providing a quick lookup option to pick the branch accurately helps the system narrow down the name search to only the customers belonging to that branch, as opposed to the bank's entire customer base.
There is a very high probability that the user would not have specified the branch name had the quick lookup not been provided and the branch mandated. Note that just mandating the branch name without a lookup will only make the user provide approximate or even incorrect names, leading to multiple searches.
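
The effect of the branch lookup can be sketched with a small in-memory database. Everything here is an assumption made for illustration: the `branch` and `customer` tables, the 20 branches of 500 customers each, the sample names and the helper functions are hypothetical.

```python
import sqlite3


def build_demo_bank():
    """In-memory bank with 20 hypothetical branches of 500 customers each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE branch (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, branch_id INTEGER, name TEXT)"
    )
    conn.execute("CREATE INDEX idx_customer_branch ON customer (branch_id)")
    conn.executemany(
        "INSERT INTO branch (id, name) VALUES (?, ?)",
        [(b, f"Branch {b:02d}") for b in range(1, 21)],
    )
    surnames = ["Kumar", "Smith", "Kumaran", "Lee", "Khan"]
    conn.executemany(
        "INSERT INTO customer (branch_id, name) VALUES (?, ?)",
        [(b, f"{surnames[i % 5]} {i}") for b in range(1, 21) for i in range(500)],
    )
    conn.commit()
    return conn


def lookup_branches(conn, typed):
    """Quick lookup: let the user pick an exact branch from a short list."""
    return conn.execute(
        "SELECT id, name FROM branch WHERE name LIKE ? ORDER BY name LIMIT 10",
        (typed + "%",),
    ).fetchall()


def search_customers(conn, name_prefix, branch_id=None):
    """Name search, optionally narrowed to the branch picked in the lookup."""
    sql = "SELECT id, name FROM customer WHERE name LIKE ?"
    params = [name_prefix + "%"]
    if branch_id is not None:
        sql += " AND branch_id = ?"
        params.append(branch_id)
    return conn.execute(sql, params).fetchall()


conn = build_demo_bank()
branch_id, branch_name = lookup_branches(conn, "Branch 07")[0]
bank_wide = search_customers(conn, "Kumar")
one_branch = search_customers(conn, "Kumar", branch_id)
print(len(bank_wide), "matches bank-wide,", len(one_branch), "in", branch_name)
```

Because the lookup hands the search an exact branch id rather than free text, the name search only has to consider that branch's customers, and the user is far less likely to need a second attempt.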
Knowing the most frequently used usecases is again an often overlooked aspect. This seemingly insignificant point helps a lot in deciding the right solution to adopt for a context. Let us for a moment look at two usecases:
  1. Banker searches account ledger
  2. Customer searches account history
Both are legitimate usecases with legitimate business contexts. However, one can immediately spot that a banker searching the account ledger is likely to happen many more times in a day than a customer searching the account history.
It is important to appreciate the usage when designing a solution for these searches: the banker search deserves a powerful solution, while the customer search can be relegated to a lower powered one.
This translates not only into cost savings for the developers, but also into savings in system resources and the avoidance of over-engineered solutions.

We have now covered some basic aspects that are a must for evolving the right solution to your search problem.
In the next part, we will look at some common and some uncommon solutions. Until then keep your feedback coming.
