Category Archives: Big Data

Security and Privacy in the Cloud

Hortonworks announced their plan to acquire XA Secure and open source it. XA Secure claims to offer a comprehensive approach to Hadoop security. This made me think of the various aspects of security in the cloud.

Security in the cloud spans multiple layers that involve people, compute, network and storage. It requires an integrated strategy of process and tools, so that end users can complete their work in an environment that enforces compliance without getting in their way.

Here is how I think of the top 5 areas of focus for security in the cloud.


Focus Area 1: APPLICATION SECURITY

Application security mainly deals with protecting the application resources. This includes a multi-pronged approach to cover the following:

  • Enforcing strong authentication and authorization
  • Data encryption on the wire: end-to-end encryption using SSL/TLS for all connections, both browser and APIs
  • Data encryption for data at rest
  • Data encryption for data in memory
  • Application whitelisting
  • Role-based access to application resources
  • Session tracking
  • Controls for privileged or elevated access
  • Enforce context awareness and notifications

Focus Area 2:  DATA SECURITY

According to Forrester’s TechRadar report on data security, security is the second largest portion of the IT budget. In 2014, the investment is expected to rise by 45%. Data security is no longer just an IT issue. It is an important business driver, since data is now closely tied to a company’s financial costs and to the business damage that a data breach can cause.

Data masking and Data Loss Prevention (DLP) offerings are best suited for addressing data security. To enforce security on the data, you would want to:

  • Know where the data exists (both structured and unstructured) in order to secure it
  • Continuously monitor access to the data
  • Protect both production and non-production data
  • Run regular audits to maintain compliance
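
As a rough illustration of what data masking can look like when production data is copied into a non-production environment, here is a minimal Python sketch. The function names, the salt and the `example.invalid` replacement domain are all hypothetical choices for this example, and real masking or DLP products do considerably more (format-preserving encryption, discovery, policy management).

```python
import hashlib
import re

# Deliberately simple email pattern, good enough for a sketch only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_card_number(card_number: str) -> str:
    """Replace every digit except the last four with '*', keeping separators."""
    total_digits = sum(1 for c in card_number if c.isdigit())
    keep_from = total_digits - 4
    masked = []
    seen = 0
    for c in card_number:
        if c.isdigit():
            masked.append(c if seen >= keep_from else "*")
            seen += 1
        else:
            masked.append(c)
    return "".join(masked)


def pseudonymize(value: str, salt: str) -> str:
    """Deterministic pseudonym: the same input and salt always give the same token."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def mask_emails(text: str, salt: str) -> str:
    """Swap each email address in free text for a stable pseudonymous one."""
    return EMAIL_RE.sub(
        lambda m: "user-" + pseudonymize(m.group(0), salt) + "@example.invalid",
        text,
    )


masked_card = mask_card_number("4111 1111 1111 1234")
masked_note = mask_emails("contact alice@example.com now", salt="demo-salt")
```

Because the pseudonym is deterministic, records belonging to the same person still join correctly in a test database, while the original address never leaves production. The salt has to be kept secret for that to hold.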


Focus Area 3: NETWORK AND STORAGE SECURITY

Explosive growth in data and digital assets in the cloud drives the need for high-performance, reliable network and storage. This calls for sensitive information flowing through the network and storage to be encrypted both in motion and at rest.

With customers needing to keep using their prior software investments productively, the hybrid cloud is pushing cloud security to operate in a hybrid model. Such hybrid environments need to support secure links and encryption across on-premise networks and storage units.

Some of the important features to pay attention to around network and storage security are:

  • Authentication
  • Confidentiality and data-level protection
  • Certifications for compliance with legislative and regulatory mandates
  • Privileged user access and separation of duties
  • Centralized key management
  • Real-time monitoring of traffic across the network

Focus Area 4:  DATA PRIVACY

In this digital age, and especially in the cloud, where personally identifiable information and other sensitive information is collected and stored, privacy concerns are highly prominent. The challenge of data privacy is to share data while protecting personally identifiable information. Data privacy has become a very high priority in certain markets such as healthcare, criminal justice, financial services, life sciences and more. Laws for the protection of privacy have now been adopted worldwide, but their definitions and objectives vary from one country to another.

It is important that cloud vendors make sure that their cloud offerings get certified under the EU, US and other Safe Harbor programs.

Focus Area 5: DATA CENTERS

Primarily due to cost effectiveness, customers are adopting cloud and hybrid services as their business model at various stages of their business cycle. This is driving data centers to adopt virtualization technologies to rapidly expand their infrastructure reliably and effectively into the cloud.

Some of the common challenges around security in the data center are:

1. Multi-Tenancy  

The resources belonging to multiple customers reside on the same physical platforms. Proper security measures must be adopted so that one customer’s data cannot be breached or spill over to another, even when multiple customers are using the same resources and platform in the virtual environment.

2. Compliance and Privacy Restrictions

Even though the infrastructure and resources of the data centers are managed by the cloud vendor, the vendor should be prevented from monitoring and auditing any customer components or data. Because of compliance and privacy restrictions, this includes preventing the vendor from inspecting the network through which customer data passes. Cloud vendors should think through these privacy and compliance challenges so that they can clearly isolate these tasks and give customers the ownership to manage, monitor and audit on their own. Providers may need to comply with ISO 17799 based policies and procedures and be regularly reviewed as part of the SAS 70 Type II audit process.

In summary, security enforcement in data centers involves:

  • Data protection at the application, network and storage layers through access control and encryption
  • Protecting systems through hardening, intrusion detection and prevention
  • Monitoring and Auditing through certifications to meet compliance regulations, change control around upgrades and patches, proper role and privileged access management.

What does it mean to adopt a Cloud strategy?

Platform as a Service (PaaS), Infrastructure as a Service (IaaS) and Software as a Service (SaaS) were the cloud computing buzzwords of 2012. Vendors like SAP, IBM, Microsoft, Red Hat, Oracle, VMware and Citrix all entered this space early on, and now we see these solutions evolving into second-generation products in 2013 (Read more at

Now that cloud computing is making a huge impact in other market areas like big data, social and mobility, helping to drive and support new business scenarios, we will see more and more hardware and software vendors embarking on this journey with their products and solutions. (See: Gartner: 10 critical IT trends for the next five years)

[Figure. Source: Business value of Cloud Computing]

The benefits of cloud offerings are often associated with reducing cost and increasing agility. While this is true, the more strategic role that cloud solutions can play for customers and vendors is in achieving operational excellence, product leadership, customer intimacy, and open innovation. Cloud computing is part of a long and powerful trend towards virtualization. Virtualization acts as a stepping stone for the cloud: it helps bring down operational cost while facilitating speed and agility in deployment and maintenance in the long run.

Given the above factors, the following are typical areas to consider when thinking of ROI when adopting a cloud strategy:

  1. Hardware costs – how much will this save in terms of servers and storage devices?
  2. Maintenance for the hardware – will there be any savings?
  3. Software license costs – usually the license cost for a cloud solution is lower than for on-premise. How much can the cost be reduced per seat?
  4. Maintenance for the software – include both the vendor support and your internal support costs.
  5. Facilities costs – can you lower the power, HVAC, building costs etc.?
  6. Productivity/efficiency costs – what is the learning curve? Are the people who will use the new system more productive? What is the cost of training?
  7. Agility around new opportunities – are you able to respond faster and cheaper to opportunities that would otherwise have taken more development time and money?
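
To make the ROI discussion concrete, the cost categories above can be put into a small calculation. This is only a sketch: every figure below is a made-up placeholder, the `AnnualCosts` and `cloud_roi` names are hypothetical, and the one-time `migration_cost` is an assumption added for the example. Agility (item 7) is left out because it is hard to express as a single annual number.

```python
from dataclasses import dataclass


@dataclass
class AnnualCosts:
    """Yearly cost per category, mirroring items 1 to 6 of the list above."""
    hardware: float
    hardware_maintenance: float
    software_licenses: float
    software_maintenance: float
    facilities: float
    training: float  # stands in for productivity/efficiency costs

    def total(self) -> float:
        return (self.hardware + self.hardware_maintenance
                + self.software_licenses + self.software_maintenance
                + self.facilities + self.training)


def cloud_roi(on_premise: AnnualCosts, cloud: AnnualCosts,
              migration_cost: float, years: int) -> float:
    """Simple ROI: (cumulative savings - one-time migration cost) / migration cost."""
    if migration_cost <= 0:
        raise ValueError("migration_cost must be positive")
    savings = (on_premise.total() - cloud.total()) * years
    return (savings - migration_cost) / migration_cost


# Placeholder numbers, purely for illustration.
on_premise = AnnualCosts(100_000, 20_000, 60_000, 15_000, 25_000, 5_000)
cloud = AnnualCosts(0, 0, 45_000, 10_000, 0, 20_000)
roi = cloud_roi(on_premise, cloud, migration_cost=150_000, years=3)
```

With these placeholder inputs the on-premise setup costs 225,000 a year and the cloud setup 75,000, so three years of savings (450,000) against a 150,000 migration gives an ROI of 2.0, or 200%. The value of the exercise is less the final number than being forced to fill in each category honestly.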

Security Intelligence – Role of Big Data in Fraud Prevention and Management

Fraud is a serious problem and requires a new way of thinking to address it. Irrespective of the market, whether it is financial services, online retail, point of sale or healthcare, fraud prevention and management is a major pain point for customers these days.

In the security market, real-time security intelligence combined with the power of Big Data is spearheading growth, as solution vendors innovate and differentiate their solutions from old-school security vendors.

Fraud causes companies to lose money in many ways. There is a growing need for real-time solutions that help organizations automatically detect anomalies in user or system behaviour early on, so that they can notify the right people and take appropriate action. This helps prevent fraud and the loss of revenue.

Let’s take the example of healthcare to list some of the well-known challenges around fraud.

  • Organized groups defrauding insurance companies through elaborate schemes against government-sponsored programs or private health insurers
  • Patient medical IDs are stolen or duplicated for financial benefits
  • User impersonation for prescription drug benefits and many more…

Meanwhile, hospitals and HMOs pay a heavy price through fines and litigation if they don’t comply with the healthcare laws enforced by the government. So they have to put appropriate checks and measures in place to prevent violations by their users, patients and doctors when they use the applications and systems.

Old-school way of fraud management:

[Figure: the current fraud evaluation mechanism]

Most companies have invested in and adopted multi-factor authentication methods (e.g. passwords, smart cards, One-Time Passwords (OTP), biometrics) not only as a mechanism to identify and protect the users of their applications and systems, but also as a way to manage fraud. The picture here suggests a mechanism that they currently enforce to do a fraud evaluation.

These companies have quickly understood that multi-factor authentication alone cannot scale and address fraud issues since the bad guys have figured out a way to break through these multi-factor authentication mechanisms.

This is why there is a need for a real-time security intelligence solution!

Real-time Security Intelligence through Big Data

The challenges that make real-time intelligence gathering the right approach to address fraud are:

  1. No single layer of defense, not even multi-factor authentication, is enough to keep determined fraudsters out of enterprise systems. Multiple layers must be employed to defend against today’s attacks and those that are yet to appear.
  2. No authentication measure on its own, especially when communicating through a browser, is sufficient to counter today’s threats. Additional fraud prevention layers must be utilized.
  3. Malware is the biggest immediate threat, and malware-based attacks are spreading to multiple sectors and enterprises.

[Figure: online retail user flow]

Let’s take the example of an online retail scenario where users shop for goods through a browser on a PC, smartphone or tablet.

As the picture shows, a typical user will make multiple clicks and interact with multiple applications in the background through a browser before reaching the shopping cart. This means there is a way for us to gather a lot more data about the user and analyse their behavior in real time.

Here are some of the steps that will help us build real-time intelligence around user behavior:

1. Endpoint data: capture the user’s context at the endpoint, which is their device. For example, is the user on a browser on a PC, tablet or smartphone? Capture the user’s IP address, geo-location, authentication credentials and more.

2. Session data: gather, monitor and analyze the user’s session (e.g. HTTP POST parameters and other session attributes) and navigation behavior in the browser. Compare this with earlier navigation patterns to identify abnormal patterns based on the user’s transactional history.

3. User data: gather, monitor and analyze the user’s behavior to identify any anomalous behavior during the transaction.

4. Context analysis: analyse the relationships among internal and/or external entities, systems and their attributes (for example, users, accounts, account attributes, machines and machine attributes). Analyze the application logs, system logs and database logs, and build predictive models of user behaviour around the applications and systems involved.

The intelligence gathering and analysis in the above steps involves both gathering the right data and analyzing it with an effective algorithm. This is where Big Data plays a role: it helps build an effective and accurate model of the user’s interaction with the application and system, which in turn helps detect anomalies and prevent and manage fraud efficiently.

The success of such a real-time security intelligence solution boils down to the quality of data collection and to advanced algorithms that address the 3 Vs of Big Data, not only to build accurate predictive models but also to support self-learning so that the solution gets smarter over time.
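
To give a feel for how endpoint and session data can be compared against a user’s earlier patterns, here is a deliberately small Python sketch. It is not a production fraud model: the `Session` fields, the weights (2.0 for a new country, 1.0 for a new device) and the cap of 3.0 are all assumptions chosen for illustration, and a real solution would learn such parameters from far more data.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass
class Session:
    country: str              # endpoint data: coarse geo-location
    device: str               # endpoint data: device type
    clicks_before_cart: int   # session data: navigation behavior
    order_amount: float       # user/transaction data


def capped_z(value: float, history: List[float], cap: float = 3.0) -> float:
    """Distance of `value` from the historical mean in standard deviations, capped."""
    if len(history) < 2:
        return 0.0  # not enough history to judge
    sigma = pstdev(history)
    distance = abs(value - mean(history))
    if sigma == 0:
        return cap if distance > 0 else 0.0
    return min(distance / sigma, cap)


def risk_score(current: Session, past: List[Session]) -> float:
    """Higher means the current session looks less like the user's history."""
    score = 0.0
    if current.country not in {s.country for s in past}:
        score += 2.0  # never-seen location (hypothetical weight)
    if current.device not in {s.device for s in past}:
        score += 1.0  # never-seen device (hypothetical weight)
    score += capped_z(current.clicks_before_cart,
                      [s.clicks_before_cart for s in past])
    score += capped_z(current.order_amount, [s.order_amount for s in past])
    return score


history = [
    Session("country-A", "laptop", 10, 48.0),
    Session("country-A", "laptop", 12, 52.0),
    Session("country-A", "laptop", 14, 50.0),
    Session("country-A", "phone", 11, 55.0),
    Session("country-A", "laptop", 13, 45.0),
]
usual = Session("country-A", "laptop", 12, 52.0)
unusual = Session("country-B", "tablet", 1, 900.0)

usual_score = risk_score(usual, history)
unusual_score = risk_score(unusual, history)
```

The usual session scores well below 1, while the unusual one (new country, new device, a single click and a much larger order) hits the maximum on every signal. In practice the score would feed a decision such as asking for step-up authentication or holding the order for review, rather than blocking outright.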

Big Data: Why Enterprises need to start paying attention to their Data sooner?

Awareness around Big Data is on the rise, and it is exciting! As we all know, in the technology space the term Big Data revolves around the 3 Vs: the Volume, Velocity and Variety of the data that is typically seen in enterprises these days.


The blog on visualizing the 3V concept is a good resource that provides a view into Big Data if you are not so familiar with it or have the question: “What is Big Data?”

It’s 2013, and the time is right for all enterprises to pay more attention to their Big Data. With the right technology and processes around their Big Data, enterprises can now trigger new ideas for business growth in 2013.

Some of the exciting new strategies that enterprises should look at for their Big Data are:

1. Build advanced predictive models with the information they already have about their customers and products, to create new products and marketing services that differentiate them from their competitors.

2. Data mining to understand their customers’ buying personas, which helps capture new customers and markets.

3. Real-time analytics to understand customers’ past behavior patterns, which provides a greater ability to satisfy existing customers with personalized services that are relevant to their needs and wants.

The Beginning…

Facebook hit Big Data issues half a decade ago, when they had to process huge amounts of structured ( ex. …) and unstructured (e.g. video, email, text) data. Facebook became one of the large early adopters of Hadoop, a software platform developed largely at Yahoo for processing and analyzing epic amounts of data streaming across the modern web. These days social media platforms like Twitter and LinkedIn also have to deal with Big Data to keep their systems operational. Guess what they are using to process and manage it? The answer is Hadoop. Today eBay and dozens of other high-profile web vendors use Hadoop to analyze the vast amounts of data generated during their online operations.

Reshaping the Business Model around Big Data

Most enterprises have Big Data that they have gathered in their data warehouses over the years. But they do not know how to use it, nor do they know how the data they have gathered, or the new data they can collect, will benefit them. This is why businesses need to spend more time understanding the importance of their existing data and thinking of ways to incorporate data that can help them grow their revenue.

Let’s look at some examples to understand the value of Big Data in specific markets. Mobile applications, tablets and smartphones are driving customers and services to consume and integrate structured and unstructured data from a variety of sources.

1. Healthcare Market:

Business objective: Provide, enhance and streamline how hospitals connect with and care for their patients. Develop and facilitate personalized therapies and diagnostics for patients.

Big Data opportunity: Incorporate a Big Data analysis engine to build predictive models against patients’ clinical history, genetics, blood work etc.

Why: This will help doctors make the best treatment recommendations for their patients in a timely fashion. It will help offer the best care while reducing healthcare costs by avoiding unnecessary treatments.

2. Online Retail Market:

Business objective: Connect merchants with consumers more effectively, so that consumers can find what they want conveniently and in a timely fashion. This requires merchants to know what consumers are looking for, when and where.

Big Data opportunity: Incorporate a Big Data analysis engine that builds predictive models to help make better decisions:

  • Build consumer models from transactional history, buying patterns, interest in types of goods, browsing patterns, buying power in dollar amounts etc.
  • Generate a catalogue for the merchants based on the type of goods, price, value and access.

Why: The predictive models will help connect merchants with consumers effectively, so it’s a win-win for all business entities.

3. Financial Market:

Business objective: Provide a high-performance trading platform that is effective, accurate and reliable.

Big Data opportunity: An advanced analytical engine that allows the analysis of complex data sets and the ability to connect patterns and relationships, applied to analysing news and social media feeds, scanning incoming emails, or dissecting company regulatory filings to generate predictive models.

Why: This will help traders make effective, accurate and profitable trading decisions.

Big Data Technology and Solutions:

With the 3 Vs around Big Data, enterprises will have to look at the technologies, solutions and data stores that will help them to be successful with Big Data.


Big Data Technology & Data Stores: There are a lot of vendors that offer products around Big Data software platforms and data stores. This was the first area that got a lot of attention from vendors, to address the data management, processing and operational issues around Big Data. Machine learning engines, which will help build accurate and reliable predictive models, are still evolving. Because of the volume, variety and velocity with which Big Data has to be processed, the model-building process has to be accurate, reliable and automated through advanced algorithms to be effective.

Big Data Solutions: Right now most enterprises are trying to build tailored solutions in-house to address their basic needs. The Big Data solution space is still evolving, and there are a lot of opportunities for innovation and creativity. The solution market for Big Data is still an untapped market.

The story is a bit different when it comes to real-time analytics. Enterprises clearly understand the importance of real-time analytics and the value it provides to the current business. As a result, there are vendors who have already built real-time analytical solutions that the market wants and that help enterprises reshape their existing business model.

