Making online services safe and secure for more than a billion people means that security solutions have to scale well. Recent internet-wide incidents involving SSL technology, such as POODLE and Heartbleed, only reinforce the importance of getting this stuff right, as well as the extent to which security technology impacts more than any single company.
For our second Security @Scale conference, we wanted to openly discuss and learn from the different ways companies such as GitHub, HackerOne, Square, Twitter, and Facebook are solving the same problems. Our goals are ultimately the same: finding better engineering solutions so our front-end teams don't always have to think about writing security into their programs but can benefit from it by default. Our jobs, as security team leaders, encompass everything from incident response and product consulting to compliance and, more generally, cleaning up all the code on our networks.
At the event last Wednesday, we reviewed where security was as far back as 2010, how far it has come, and how much we still have to accomplish. Check out the talks below to hear from some of the industry's leading voices on security solutions that scale.
Presentation Summaries and Videos
Opening Remarks
Scott Renfro of Facebook's Security Infrastructure team welcomed the group to this year's Security @Scale by discussing the state of cyber security and his experience with past Security @Scale conferences.
Mutiny on the Bounty: Lessons Learned in How Data Defeated Dogma
Katie Moussouris, now Chief Policy Officer of HackerOne, dove into the development of Microsoft's bug bounty program, which she pioneered over three years of looking at data starting in 2010 and announced in 2013. Her talk showed how game theory, economics, politics, and data turned heresy into gospel at the world's largest software company. Katie focused on alternate bounty models and levers that don't require vendors to be the highest bidder in order to create successful, structured incentive programs that bring about specific strategic outcomes. She walked through the motivations of hackers beyond monetary compensation and showed the success of the programs she created, which have paid out $253,000 since June 2013. She stressed that organizations that offer bounties have choices in factors like timing and turning a thin market into a thick market, and that researchers should not have to choose between doing the right thing and getting paid. She wrapped up by previewing how to set up a bounty program based on data plugged into any organization's models and goals.
Human Botnet: Scaling Your Security Organization
Diogo Mónica talked about how Square scaled its vulnerability management, access control, and security monitoring initiatives while still effectively managing the organization’s risk. As the organization adds employees, the number of hours Security Engineers sleep every night continuously decreases. This happens because ensuring security against malicious actors, both internal and external, gets harder as the company grows. Diogo presented many different systems that Square built internally to effectively scale the security team’s job. In particular, he introduced Report Card, a tool to socialize the security status of a project within the company; Doorman, a centralized 2-Factor SSO that allows self-enrollment and expiration of arbitrary capabilities; and Sting, a security alerting tool that distributes the load of dealing with low signal-to-noise alerts throughout the engineering organization.
Better Large Scale Rule Engines with Haxl
Louis Brandy of Facebook's Site Integrity team kicked off a discussion about scalable spam fighting and the anti-abuse structure at Facebook and Instagram. He stressed the importance of focusing on systems instead of spam itself, and how his team at Facebook has focused on progressive refinement in developing an efficient scalable rule engine powered by a domain-specific language called Haxl. Haxl works by allowing the user to write a simple and expressive set of rules, and have those rules be run with aggressive I/O scheduling and optimizations. Haxl automatically explores the computation to batch multiple requests to the same source, fetch from multiple sources concurrently and cache and deduplicate identical requests. This creates a simple and expressive rule language that executes optimally and reduces latency.
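To make the three optimizations above concrete, here is a minimal sketch in Python. It is not Haxl (which is written in Haskell) and it does not reflect Facebook's implementation; the BatchingFetcher class, the fetch_all method, and the two toy data sources are hypothetical names invented for this illustration. It only shows the general idea: group requests by source, drop duplicates, call each source once, run the per-source calls concurrently, and remember the answers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class BatchingFetcher:
    """Toy illustration of batching, deduplicating, and caching data requests."""

    def __init__(self, sources):
        # sources maps a source name to a function that takes a list of keys
        # and returns {key: value} in a single round trip (an assumption).
        self.sources = sources
        self.cache = {}
        self.round_trips = 0

    def fetch_all(self, requests):
        """Resolve a list of (source, key) requests with one call per source."""
        pending = defaultdict(set)
        for source, key in requests:
            if (source, key) not in self.cache:
                pending[source].add(key)  # the set drops duplicate keys

        # Issue one batched call per source, with the sources queried concurrently.
        with ThreadPoolExecutor() as pool:
            futures = {
                source: pool.submit(self.sources[source], sorted(keys))
                for source, keys in pending.items()
            }
            for source, future in futures.items():
                self.round_trips += 1
                for key, value in future.result().items():
                    self.cache[(source, key)] = value

        return [self.cache[request] for request in requests]


# Hypothetical data sources standing in for real backends.
def user_source(ids):
    return {i: f"user-{i}" for i in ids}


def post_source(ids):
    return {i: f"post-{i}" for i in ids}


fetcher = BatchingFetcher({"users": user_source, "posts": post_source})
results = fetcher.fetch_all([("users", 1), ("users", 2), ("users", 1), ("posts", 7)])
print(results, fetcher.round_trips)
```

Here the rule author only lists what is needed; four requests turn into two round trips, and a later request for the same user is answered from the cache without touching the source again.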
SSL++: Tales of Transport-Layer Security at Twitter
Twitter's unofficial motto, as explained by Jim O'Leary, is 'all TLS everything.' During his talk, he detailed the process of switching an entire user base from HTTP to HTTPS across Twitter and third-party clients. Since 2011, there have been lessons learned, including how to introduce security features (moving from opt-in to opt-out to completely deprecating old insecure methods). Jim also described a few tricks for protecting users: secure indexing in search engines, Strict-Transport-Security, Content-Security-Policy, and certificate pinning. Twitter has open-sourced the secureheaders library so that developers also benefit from secure defaults in their projects. The talk ended with a walk through some of the recent vulnerabilities in SSL and how a large organization like Twitter responds to them.
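As a rough illustration of what "secure defaults" for response headers can look like, here is a small Python sketch. It is not the secureheaders library or its API; the function names, the one-year max-age, and the default-src 'self' policy are assumptions chosen for the example. Only the two header names and their directive syntax come from the standards themselves.

```python
def secure_default_headers(max_age_seconds=31536000, include_subdomains=True):
    """Build a dict of security headers to attach to every HTTPS response."""
    # Strict-Transport-Security tells browsers to use HTTPS only for this host.
    hsts = f"max-age={max_age_seconds}"
    if include_subdomains:
        hsts += "; includeSubDomains"
    return {
        "Strict-Transport-Security": hsts,
        # A deliberately strict example policy: load resources from our own origin only.
        "Content-Security-Policy": "default-src 'self'",
    }


def apply_secure_defaults(response_headers):
    """Start from the secure defaults, then let explicit per-response values win."""
    merged = secure_default_headers()
    merged.update(response_headers)
    return merged


print(apply_secure_defaults({"Content-Type": "text/html"}))
```

The design point is that a developer who sets nothing still gets the protective headers, while one who needs a different policy can override it explicitly.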
Host Intrusion Detection with osquery
Mike Arpaia announced Facebook's latest open source project, osquery. Osquery is a simple, performant, reliable, flexible, and easy-to-integrate tool that helps defenders catch attackers at scale by using SQL queries to explore operating system state. The project is split into two components, osqueryi and osqueryd. Osqueryi exposes tables via an interactive SQL shell, which allows you to join and aggregate information across several simple tables. Osqueryd is a daemon for low-level host monitoring that tracks how the results of a query change over time. He closed the talk with a description of the host event pub/sub stream and operating system introspection, and of how the team uses Homebrew to manage dependencies.
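The two ideas described above, joining simple tables with SQL and watching how a query's results change over time, can be sketched with Python's built-in sqlite3 module. This is not osquery and does not call it: the in-memory tables are stand-ins loosely modeled on osquery's processes and listening_ports tables, the rows are made-up sample data, and the snapshot and diff helpers are hypothetical names for this example.

```python
import sqlite3

# Which process is listening on which port? A join across two simple tables.
QUERY = """
SELECT p.name, l.port
FROM listening_ports AS l
JOIN processes AS p ON p.pid = l.pid
ORDER BY l.port
"""


def make_host_db():
    """Create in-memory stand-ins for two host-state tables."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processes (pid INTEGER, name TEXT)")
    db.execute("CREATE TABLE listening_ports (pid INTEGER, port INTEGER)")
    return db


def snapshot(db):
    """Run the query and return its rows as a set, for easy comparison."""
    return set(db.execute(QUERY).fetchall())


def diff(previous, current):
    """Report rows that appeared or disappeared between two snapshots."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


db = make_host_db()
# Made-up sample data for the first observation.
db.executemany("INSERT INTO processes VALUES (?, ?)", [(100, "sshd"), (200, "nginx")])
db.executemany("INSERT INTO listening_ports VALUES (?, ?)", [(100, 22), (200, 443)])
before = snapshot(db)

# Later, an unexpected listener shows up on the host.
db.execute("INSERT INTO processes VALUES (300, 'nc')")
db.execute("INSERT INTO listening_ports VALUES (300, 4444)")
after = snapshot(db)

changes = diff(before, after)
print(changes)
```

Reporting only the difference between runs keeps the output small: a defender sees the new listener rather than the full list of ports on every check.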
Homebrew Incident Response
Facebook's Michael McGrew began where Mike Arpaia left off with a description of homebrew incident response. He explored the state of affairs for incident response (IR), including the positives and negatives of how many companies are approaching IR. This is important because no company is immune to a data breach. Michael went through basic questions that IR teams will need to ask of their environments during an incident. Incident response relies heavily on preparation; otherwise, you may not have logs that answer your questions. He went on to talk in depth about Facebook's Network Security Monitoring platform and the components that make up the stack. The takeaway was that companies that follow established IR guidelines, prepare for a breach by instrumenting their networks and systems, and build tools and infrastructure to automate as much as possible will be able to identify, contain, and eradicate threats faster.
Building Your Own DFIR Sidekick: ChatOps for Incident Response
Scott Roberts described how he and the rest of the team at GitHub developed security capabilities for Hubot, their company chat bot. Because the team is widely distributed geographically, uses different devices, and often works time-shifted, most communications, including incident response, rely on what he calls ChatOps. Team members can monitor code that's currently running, change firewall tools, and even ban users. The collaborative nature of ChatOps is comparable to World of Warcraft, he explained, and is a multi-device, teachable way to communicate. Hubot is built on Node.js, which makes developing new security tasks something everyone on the team can do. He then explained the creation of Hubot and gave many examples of where Hubot is deployed.
We hope you enjoyed this set of talks from Security @Scale. Facebook's @Scale conference series was created for members of the technology industry to discuss lessons and best practices for working at large scale, and we will post more here when we have updates to share.