
HoundDog.ai helps developers prevent personal information leaks

HoundDog.ai, a startup that helps developers ensure their code doesn't leak personally identifiable information (PII), emerged from stealth on Wednesday and announced a $3.1 million seed round led by E14, Mozilla Ventures and ex/ante, with a number of angel investors also participating. Unlike other scanning tools, HoundDog looks at the code a developer writes, using both traditional pattern matching and large language models (LLMs) to find potential problems.

HoundDog was founded by Amjad Afanah, who previously co-founded DCHQ, which Gridstore acquired in 2016 (and which then changed its name to HyperGrid, to complicate matters further). Afanah also co-founded apisec.ai, which is still active, and worked at self-driving car startup Cruise. The inspiration for HoundDog came from his time at data security startup Cyral and his conversations with data protection teams there, he told me.

Photo credit: HoundDog.ai

“When I was at Cyral, we had a lot of data,” he said. “Cyral – like many others in the data security sector – focuses on production systems. They help you discover and classify your structured data and your databases and then apply access controls. But the overwhelming feedback I kept getting from security and privacy teams alike was, 'You know, it's a little too reactive and doesn't keep up with changes in the codebase.'”

So HoundDog shifts this process even further to the left. While it still runs in the continuous integration flow and not yet in the development environment (although that might happen in the future), the idea is to find potential data leaks before the code is merged. Most importantly, HoundDog does this by looking at the actual code, not the data flow it generates. “Our source of truth is the code base,” Afanah said.


For example, if a development team started collecting Social Security numbers, HoundDog would flag the change and warn the team before the code was ever merged; it would also alert the security team. After all, a leak of that kind could turn into a big and costly problem.
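To make the idea concrete, here is a minimal sketch of what a pattern-matching pass over source code could look like. It is not HoundDog's implementation, which the company has not published; the pattern list, the `Finding` type and the `scan_source` and `should_block_merge` helpers are all hypothetical names chosen for illustration. Simple substring patterns like these produce false positives, so a real scanner would need far more context.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical patterns: identifier fragments that often signal PII fields.
# These are illustrative assumptions, not rules taken from any real product.
PII_PATTERNS = {
    "ssn": re.compile(r"ssn|social_?security", re.IGNORECASE),
    "email": re.compile(r"e_?mail", re.IGNORECASE),
    "date_of_birth": re.compile(r"date_?of_?birth|birth_?date", re.IGNORECASE),
}


@dataclass(frozen=True)
class Finding:
    line_number: int  # 1-based line number in the scanned source
    category: str     # key from PII_PATTERNS that matched
    line: str         # the matching line, stripped of surrounding whitespace


def scan_source(source: str) -> List[Finding]:
    """Return one finding per (line, PII category) match in the source text."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for category, pattern in PII_PATTERNS.items():
            if pattern.search(line):
                findings.append(Finding(line_number, category, line.strip()))
    return findings


def should_block_merge(findings: List[Finding]) -> bool:
    """A CI step could fail the build whenever any finding is reported."""
    return bool(findings)


if __name__ == "__main__":
    sample = (
        "String name = user.getName();\n"
        "String ssn = user.getSocialSecurityNumber();\n"
    )
    for finding in scan_source(sample):
        print(f"line {finding.line_number}: possible {finding.category}: {finding.line}")
```

Run in a continuous integration job against the files changed in a pull request, a check like this could fail the build and notify a security team when a new PII field appears, which is the general workflow described above.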

The service currently supports code written in Java, C#, JavaScript and TypeScript, as well as SQL, GraphQL and OpenAPI/Swagger queries. Support for Python is imminent, the company says.

Afanah noted that such a tool becomes particularly important in the age of AI-generated code, a point that Replit CEO (and HoundDog angel investor) Amjad Masad echoed.

“As more companies use AI-generated code to accelerate development, it becomes imperative to incorporate security best practices and ensure the security of the generated code,” said Masad. “HoundDog.ai is a leader in securing PII data early in the development cycle, making it an indispensable part of any AI code generation workflow. This is why I decided to invest in this company.”

HoundDog itself also uses AI, however. It currently relies on OpenAI's models for its LLM-based scanning, but it's important to stress that this is optional. Users who are concerned about their code leaving their private repositories can choose to rely only on the company's more traditional code scanner.

A key part of HoundDog's value proposition is that it can reduce compliance costs for startups thanks to its automated reporting capabilities. The service can automatically create a record of processing activities (RoPA). HoundDog uses generative AI to create these reports, which involves sending data to OpenAI. The team emphasizes that only the tokens the service has discovered through its regular scanner are shared with OpenAI and that the actual source code is not.

The company offers a limited free plan, with paid plans starting at $200/month for scanning up to two repos.