Live Chat 5/29: Big Data & Security in the Supply Chain

89 comments on “Live Chat 5/29: Big Data & Security in the Supply Chain”

  1. Ashu001
    May 29, 2014

    Test message

  2. Daniel
    May 29, 2014

    Hi all

  3. Hailey Lynne McKeefry
    May 29, 2014

    Hello all!

  4. Daniel
    May 29, 2014

    Hope this will be an interesting section

  5. Hailey Lynne McKeefry
    May 29, 2014

    We should be getting started at the top of the hour, as soon as our guests arrive.  First, though, there are two housekeeping notes:

    First, please make a copy of your post before hitting the “post” button – just in case.  If the system “eats” one of your carefully crafted thoughts, please hit “Ctrl-Z” to recover it

  6. Hailey Lynne McKeefry
    May 29, 2014

    Second, if you have problems posting, we suggest trying a different browser.  IE9 is a popular choice, but we sometimes find that Firefox, Chrome, or Safari works better.

  7. Daniel
    May 29, 2014

    Hi Hailey

  8. Hailey Lynne McKeefry
    May 29, 2014

    This will be a fun, fast, and friendly conversation, so please do not hold back with your comments or questions.  There are no dumb questions and we value everyone's point of view.

  9. Hailey Lynne McKeefry
    May 29, 2014

    Questions, theories, ideas, real world experiences and even friendly rants are welcome here.

  10. Hailey Lynne McKeefry
    May 29, 2014

    As you arrive, please introduce yourself so we can offer words of welcome, and offer you a seat as well as a bit of EBN's famous virtual guacamole and chips.

  11. Daniel
    May 29, 2014

    Hailey, control+z won't work in tablet and smartphone

  12. Hailey Lynne McKeefry
    May 29, 2014

    Hi Jacob, I'm glad you could join us! Feel free to throw out any starting thoughts or questions while we are waiting to get started.

  13. Daniel
    May 29, 2014

    Hailey both big data and security are hot topics in industry

  14. Daniel
    May 29, 2014

    Another 15 minutes to go for this interesting section

  15. Hailey Lynne McKeefry
    May 29, 2014

    They are…and of course, whenever you say Big Data and supply chain you have to add analytics…then privacy isn't far behind.

     

     

  16. dataguise
    May 29, 2014

    Hello from Dataguise!

     

  17. meetsingh
    May 29, 2014

    Hello Everybody. Just signed in Hello Hailey….

  18. meetsingh
    May 29, 2014

    This is Manmeet

  19. Hailey Lynne McKeefry
    May 29, 2014

    Welcome Dataguise and Manmeet. You are right on time!

     

  20. meetsingh
    May 29, 2014

    Thanks good to see you

  21. Hailey Lynne McKeefry
    May 29, 2014

    Today's guest is Manmeet Singh, co-founder and CEO of Dataguise, a provider of big data security intelligence and protection solutions. We'll be chatting about big data, security and the supply chain. It seems to be a really hot topic lately. Manmeet, what do you think is behind that?

     

  22. Hailey Lynne McKeefry
    May 29, 2014

    Clearly data continues to explode. Market research firm IDC in a recent report said:

     

    From 2005 to 2020, the digital universe will grow by a factor of 300, from 130 exabytes to 40,000 exabytes, or 40 trillion gigabytes (more than 5,200 gigabytes for every man, woman, and child in 2020). From now until 2020, the digital universe will about double every two years.

  23. meetsingh
    May 29, 2014

    Well, data is important. Data is the fuel that drives today's enterprise, and the supply chain too. It needs to be protected.

  24. Jamescon
    May 29, 2014

    When we talk about security in the supply chain, where have the greatest issues been? Customer/partner data accessed? Intellectual property issues? Financial data?

  25. Hailey Lynne McKeefry
    May 29, 2014

    Glad to have you with us, Jim!

  26. Hailey Lynne McKeefry
    May 29, 2014

    @Jim, great question. 🙂

  27. meetsingh
    May 29, 2014

    Jim, it's mostly the customer/partner data and the financial data.

  28. Hailey Lynne McKeefry
    May 29, 2014

    @Meetsingh, how would you characterize the electronics supply chain in terms of awareness and adoption around security and big data?

  29. Ashu001
    May 29, 2014

    Hi jacob great to hv u here!

  30. Ashu001
    May 29, 2014

    Did anyone bring the chips?

  31. Ashu001
    May 29, 2014

    @jimc-financial data is key

  32. Hailey Lynne McKeefry
    May 29, 2014

    Big data, too, brings with it the challenge of multiple streams of data. Innocuous data from one stream combined with innocuous data from another can result in information that is problematic, both because it is highly useful to hackers and because it transgresses privacy requirements.

     

     

  33. Hailey Lynne McKeefry
    May 29, 2014

    Welcome, Tech4People, pull up a chair. Chips and guacamole on the table to your left! 🙂

  34. meetsingh
    May 29, 2014

    In terms of the supply chain, clearly the top of the chain {Intel, EMC, NetApp, Cisco, Apple, Samsung} are many years into Hadoop as a tool for optimizing costs, performance, quality, and even product benchmarking-as-a-service for customers. And all these guys have deep and longstanding security controls. Some of these are being driven down to the supply chain, but that part of the process is early.

  35. Hailey Lynne McKeefry
    May 29, 2014

    I'm sure the big electronics distributors, Avnet, Arrow, etc. have similar controls in place.

  36. Ashu001
    May 29, 2014

    @hailey-Thanks!thats why hadoop is key.

  37. Hailey Lynne McKeefry
    May 29, 2014

    Do you have any advice for organizations that are trying to tackle the idea of securing big data across the supply chain? Best practices? Potential pitfalls?

  38. meetsingh
    May 29, 2014

    The combination of data definitely poses security risks. You only have to look at the Netflix example, where external data from IMDb, when combined with their data set, revealed user names. One of the capabilities that we are developing is to risk score and count sensitive data, as well as look at the amount of co-joining between data elements (e.g., a credit card # is a risk; a credit card # combined with CVV and zip code is a much bigger risk).
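
A minimal Python sketch of the co-joining idea described above: each sensitive element carries a weight, and a known risky combination multiplies the total. The weights, the multipliers, and the `risk_score` name are illustrative assumptions, not Dataguise's scoring model.

```python
# Hypothetical per-element weights (assumed for illustration only).
ELEMENT_WEIGHTS = {"credit_card": 5, "cvv": 3, "zip_code": 1, "name": 1}

# Hypothetical multipliers for combinations that are riskier together.
COMBINATION_MULTIPLIERS = {
    frozenset({"credit_card", "cvv"}): 2.0,
    frozenset({"credit_card", "cvv", "zip_code"}): 3.0,
}


def risk_score(elements):
    """Score a data set by the sensitive elements it contains."""
    present = set(elements)
    # Sum the weights of every recognized element.
    base = sum(ELEMENT_WEIGHTS.get(name, 0) for name in present)
    # Apply the largest multiplier whose combination is fully present.
    multiplier = max(
        (m for combo, m in COMBINATION_MULTIPLIERS.items() if combo <= present),
        default=1.0,
    )
    return base * multiplier


alone = risk_score(["credit_card"])
combined = risk_score(["credit_card", "cvv", "zip_code"])
```

With these assumed numbers, the card number alone scores well below the same card number joined with its CVV and zip code, which mirrors the example in the comment.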

  39. Jamescon
    May 29, 2014

    How mature is big data use when you get to the companies below that top tier of Intel, EMC, et al?

  40. Ashu001
    May 29, 2014

    @hailey-They have very good controls in place. I feel they can definitely get better.

  41. Ashu001
    May 29, 2014

    @jimc-clustering,hadoop all can b effectively outsourced in the cloud today

  42. Hailey Lynne McKeefry
    May 29, 2014

    @manmeet, some sort of standardized scoring would definitely be useful!

  43. meetsingh
    May 29, 2014

    Maturity is still coming for the second-tier companies. They are still at the puberty stage.

     

  44. dataguise
    May 29, 2014

    @jimc At Dataguise, we have customers who run the gamut of what data they consider most sensitive for their business (customer data accessed/IP/financial data).  We have out-of-box policies for PCI, PII, etc., but we enable organizations to easily define their own policies for sensitive data.

     

  45. meetsingh
    May 29, 2014

    Yes, things can be outsourced to the cloud, but the problem is that the pipe to the cloud from enterprises is not that big. So the cloud data is kept in the cloud and the rest is kept on premises. That's the reason you want big data: the compute goes to the data and not the other way round.

  46. meetsingh
    May 29, 2014

    Standardized scoring of the risk and the eventual pitfalls for privacy are really important.

  47. meetsingh
    May 29, 2014

    For best practice, people should take a holistic view of the data and what they want to achieve, and give privacy and security equal importance from the get-go.

  48. Hailey Lynne McKeefry
    May 29, 2014

    It seems that there needs to be some combination of controls for security and privacy. Part of it has to be technological in nature. Then you also have the need for training and for policies.  I also think that whenever we talk cloud there's got to be a clear understanding between the provider and the organization itself.

  49. Hailey Lynne McKeefry
    May 29, 2014

    In trying to weigh cost and risk together, how should electronics organizations figure out what's reasonable to spend in terms of time and resources compared to the relative risk?

  50. Ashu001
    May 29, 2014

    How will u do standardized scoring here?please elaborate.

  51. meetsingh
    May 29, 2014

    Yes, other than technology, the users and the business have to be trained, as it's a paradigm shift for them. Earlier, the designs were done for them. Now they are trying to extract value from past data, and training on how to use it is important.

  52. meetsingh
    May 29, 2014

    Standardized scoring varies from company to company. For some, the PII is more important. For others, the financials and the account numbers are more important. The customer has to define the priority levels.

     

  53. Hailey Lynne McKeefry
    May 29, 2014

    In addition to saying more about the scoring, please also say a little about the approach that Dataguise is taking with its customers.

  54. Ashu001
    May 29, 2014

    @meetsingh-earlier the designs were done for them?how exactly?

  55. meetsingh
    May 29, 2014

    In big data, the biggest plus you get is that business users can design what they want from the data (reports etc.) on the fly, whereas with legacy data, the data was normalized and fit into the databases in certain ways.

  56. JStieglitz
    May 29, 2014

    Generally, customers go through these four steps:

     

    1. Policy goals: what compliance laws, privacy mandates, and data breach/loss prevention are we concerned with? In general, supply chain vendors are going to be looking at sensitive data at the corporate financial level, data residency laws for data that moves across international boundaries, PCI data for purchase and credit card info, and occasionally personally identifiable info from users of products (PII).

    2. Data profiling: where and how much sensitive data is in Hadoop. Which paths is it following from source through ingestion, into Hadoop, and out to analytics, reporting, etc.

     

    3. Data protection (the heart of the assignment): What data can be deleted (removed), redacted (nullified), masked (replaced with synthetic data), or consistently masked (replaced, but with a consistent value for joins and analytics).

     

    4. Putting this all together for performance, automation, and data variety in today's Big Data "Three Vs" reality.
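
The protection options in step 3 can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Dataguise's implementation: the function names, the demo key, and the `tok_` prefix are all hypothetical. Consistent masking is shown here with a keyed hash (HMAC-SHA256) so that the same input always yields the same token and joins still line up.

```python
import hashlib
import hmac
import random

# Hypothetical demo key; a real deployment would manage keys securely.
SECRET_KEY = b"demo-key-change-me"


def redact(value):
    """Redaction: nullify the value entirely."""
    return ""


def mask(value):
    """Masking: swap each digit for a random digit, keeping the format."""
    return "".join(
        random.choice("0123456789") if ch.isdigit() else ch for ch in value
    )


def consistent_mask(value):
    """Consistent masking: same input always maps to the same token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return "tok_" + digest.hexdigest()[:16]


card = "4111-1111-1111-1111"
masked_once = mask(card)
token_a = consistent_mask(card)
token_b = consistent_mask(card)
```

Plain masking gives a different synthetic value on each call, so it breaks joins across tables. The consistent variant trades a little of that randomness for repeatable tokens that analytics can still group and join on.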

  57. meetsingh
    May 29, 2014

    They could only pull certain kind of predefined reports and not use their ideas – Big data changes that

  58. Ashu001
    May 29, 2014

    @meetsingh-for sure how u manipulate n how u modify data as well as get good results as well.

  59. Hailey Lynne McKeefry
    May 29, 2014

    @JStieglitz, that's a really useful list. thanks! I know that many electronics organizations (or enterprises in general) don't know as much as they should about their data for sure (to step #2)

  60. Hailey Lynne McKeefry
    May 29, 2014

    Which protection schemes are most useful for the data? I know we talk about layered security…but with the breadth of threats, should organizations be thinking more about extra protection for data on the move, or is it also important to look at data at rest? Is securing the system or the network still useful as a concept, or is it more about securing the actual data?

  61. meetsingh
    May 29, 2014

    @tech4people, this is a question we can answer in our webinar, and if you can email us we can go through it.

  62. meetsingh
    May 29, 2014
  63. Ashu001
    May 29, 2014

    @hailey-the sky is definitely the limit here for what can and can't b done with big data n info

  64. Ashu001
    May 29, 2014

    @meetsingh-thanks

  65. JStieglitz
    May 29, 2014

    Hailey,

    That is the whole enchilada. We are focused on sensitive data protection. There are really three fundamental services there: discovery (where is it), protection (masking or encrypting), and reporting (tell us what you did, where, and when).

    In general, these protective elements need to fit inside a broader security plan. We tend to borrow the broader Hadoop security framework that Cloudera has put forth, which starts at the perimeter (blocking access to resources via firewalling and VPNs), then access control (authorization to Hadoop jobs, roles, resources), data protection for data in transit (SSL) and at rest (Dataguise), and finally, reporting.

  66. Hailey Lynne McKeefry
    May 29, 2014

    @Tech4People, I think we've only just begun to scratch the surface on the potential for insight when you get the right analytics going for what's going on in the supply chain.

  67. Ashu001
    May 29, 2014

    @hailey-the question of data in motion is crucial as well.u need quality encryption solutions in place.not to mention taking care of the iot.

  68. Ashu001
    May 29, 2014

    Iot=internet of things.

  69. meetsingh
    May 29, 2014

    Yes, volume encryption will not work. You need a data-centric approach that protects the data as it comes in, at the cell level.

  70. Hailey Lynne McKeefry
    May 29, 2014

    We are nearing the 45 minute mark. Any last questions for our guests? Any last thoughts you want to share, Manmeet, Dataguise, and JStieglitz?

  71. Ashu001
    May 29, 2014

    @meetsingh-at the cell level?u mean like obfuscation?

  72. dataguise
    May 29, 2014

    @tech4people Agree with you — protecting data in-flight as well as at-rest.  Also encryption + masking redaction.

  73. JStieglitz
    May 29, 2014

    Cell level yes. Many protection choices… obfuscation, redaction, encryption (AES), encryption (FPE). In general, we see customers protecting 3-5% of total data.

     

  74. Hailey Lynne McKeefry
    May 29, 2014

    @JStieglitz, that seems really low to me… what level would be ideal?

  75. JStieglitz
    May 29, 2014

    Ideal is the right tradeoff between access, use, and risk. 🙂   It's going to depend on the customer, the vertical, the risks in the data sets. In a large Credit Card brand (#1 in world) we are locking 8 of 50 columns of data, but mind you, that's credit card purchase transaction data. so that's generally going to be a much richer sensitive data set than in GSC.

     

  76. Hailey Lynne McKeefry
    May 29, 2014

    Everything seems to be a tradeoff, doesn't it! that's what it seems to come down to.

  77. Hailey Lynne McKeefry
    May 29, 2014

    I want to thank our guests for some great information, and our EBN community members for some great questions.  Thank you so much for coming to today's chat!

  78. meetsingh
    May 29, 2014

    5% of a petabyte is still 50 TB of Data.  At Dataguise, we enable our customers to efficiently and cost-effectively find and protect the sensitive data.  Think needle-in-the-haystack problem.  We make it easy to find the needles without having to deal with the entire haystack.

  79. meetsingh
    May 29, 2014

    Thanks Everybody

  80. dataguise
    May 29, 2014

    Thanks for the chat!

  81. Hailey Lynne McKeefry
    May 29, 2014

    That's very true… a bunch of data is available but unused. When you think about it in real numbers, it's daunting!

  82. Hailey Lynne McKeefry
    May 29, 2014

    Thanks and we hope you'll stop by again!

     

     

  83. dataguise
    May 29, 2014

    The haystacks become even bigger when you consider the vast amount of unstructured data that is being added to the mix in Big Data projects.

  84. dataguise
    May 29, 2014

    Thanks @Hailey!

  85. Ashu001
    May 29, 2014

    @jstieglitz-how do u select what data to encrypt and what not to?

  86. Hailey Lynne McKeefry
    May 29, 2014

    @Dataguise–and when you consider the volume of growth over time–more data coming in every day

  87. JStieglitz
    May 29, 2014

    Tech4people: There are several different answers to how to select.  At the technical level, we can be programmed to encrypt specific columns, rows, and fields based on the delimiters in a file. We can also discover sensitive data automatically (e.g. search and find SOC SEC #s, names, addresses, etc.). This combination of programmatic and discovery-driven tools is the technical way we do this.

     

    There are also policy considerations around which to select, but that's a question more related to how customers build their security policies.
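
A small Python sketch of the two selection styles mentioned above, for a delimited file: discovery-driven (scan fields for a Social Security number pattern) and programmatic (protect the named column positions). The function names, the `***` placeholder, and the simple `NNN-NN-NNNN` pattern are assumptions for illustration; real discovery tools use far richer detection.

```python
import re

# Simplified SSN shape (assumed): three digits, two digits, four digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_sensitive_columns(lines, delimiter=","):
    """Discovery: return column indexes where any row looks like an SSN."""
    hits = set()
    for line in lines:
        for index, field in enumerate(line.split(delimiter)):
            if SSN_PATTERN.search(field):
                hits.add(index)
    return sorted(hits)


def mask_columns(lines, columns, delimiter=",", placeholder="***"):
    """Programmatic: replace the given column positions in every row."""
    masked = []
    for line in lines:
        fields = line.split(delimiter)
        for index in columns:
            if index < len(fields):
                fields[index] = placeholder
        masked.append(delimiter.join(fields))
    return masked


rows = ["alice,123-45-6789,CA", "bob,987-65-4321,NY"]
sensitive = find_sensitive_columns(rows)
protected = mask_columns(rows, sensitive)
```

Here discovery feeds the programmatic step: the scan finds which column holds the sensitive values, and only that column is replaced while the rest of each row stays usable.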

  88. dataguise
    May 29, 2014

    @Hailey — absolutely!

  89. dataguise
    May 29, 2014

    @tech4people — JStieglitz had to sign off, but we're happy to arrange follow up with you.  Feel free to email us at datasecurity@dataguise.com
