Launch HN: Overwatch (YC S22): OSINT platform for cyber and fraud risk
164 points by Bisen on June 12, 2024 | 89 comments
Hey HN! Arjun and Zara here - cofounders of Overwatch (https://www.overwatchdata.ai), a platform to automate OSINT and threat intel, turning it into actionable insights. Check out our clickthrough demo here: https://app.storylane.io/share/qyayvtamapis.

Overwatch began when we were working with risk and threat intel teams at Google, Stripe, and government. We experienced the immense challenge every fraud and cyber threat analyst faces: manually parsing through an ocean of data to find valuable insights and filter out the noise. This included using many of the feeds and tools out there that were often very expensive, noisy, keyword-based, and lacked accurate entity extraction or advanced query features.

Most threat intelligence tools utilize thousands of keywords and teams of analysts to manually sift through torrents of alerts. These alerts are usually individual posts on various platforms across news, social media, deep and dark web sources that have some matching keyword. This is full of false positives, requiring many hours to wade through to figure out what intel matters most to our users, why, and what they can do next.

Overwatch uses an alternative approach by layering AI agents and NLP techniques, including a combination of multifarious datasets, cluster analysis, topic modeling, Retrieval Augmented Language Models (RALM) and domain knowledgeable agents.

This allows us to (1) Filter through OSINT in real time to identify events and narratives that matter to our users, and write reports on what they could do about it; (2) Identify dark web and deep web threats, fraud methods, new tactics, compromised accounts, stolen checks, and credentials affecting our users or their peers; (3) Send an alert any time a 3rd party supplier or part of the tech stack is impacted by a widely exploited vulnerability, ransomware attack, or breach; and (4) Track malware and ransomware groups that are actively targeting your industry, including Indicators of Compromise (IOCs).

Our intelligence is actionable because the alert comes with the context and important details that an analyst needs to make an informed decision. Being AI-native, we also have a range of chat and data visualization features to effectively function as an intel co-pilot or industry expert. Finally, our in-house intelligence analysts and investigators can assist threat intelligence teams with HUMINT investigations and darkweb acquisition.

Our current customers include internet platforms, financial institutions, and supply chain companies. Within a day of one breach, one of our customers used Overwatch to surface 18,000+ leaked credentials. Another used us to surface fraudulent checks and learn exactly how threat actors were targeting their specific product features.

Our website says “Request a demo” but if you want to poke around on a very basic example of how we’re aggregating dark web, deep web, social, and surface web, log in at https://app.overwatchdata.io/ using these credentials: username: try_overwatch@overwatchdata.io pw: HelloHNWorld

That login is for an un-personalized feed of cyber threat intel (breaches, vulnerabilities, ransomed organizations, and industry updates) that gives you a flavor of not just the kind of information we can collect, but more importantly, how our technology prioritizes, clusters, and summarizes alerts for cyber / fraud analysts. Try the chat agent on the left-hand side to parse through the data.

Or sign up for a longer trial and preview of our email alerts: https://xryl45u9uep.typeform.com/to/pvtZQyS0. You can also check out our clickthrough demo for dark and deep web intelligence: https://app.storylane.io/share/qyayvtamapis.

Integration options range from simple dashboard access to our API for those who want to weave our intelligence directly into other products. Pricing is dependent on how complex a threat landscape our users want to monitor and we’re still figuring out how to standardize this but we’ll always do our best for the HN community.

Since the platform is AI-powered, it can also be used for news monitoring, supply chain disruptions, regulatory monitoring, or social media monitoring. We’ve had a lot of experience wrangling text-based feeds and using numerous AI models (from embeddings and entity extractors to LLMs) to filter, categorize, cluster, and analyze the data into meaning - so let us know if you’d like to nerd out or have had any particular challenges. Looking forward to your feedback and questions! Thanks, HN!



Using RAG is definitely a relief factor after reading that you're using AI and NLP for aggregate analysis, but I'm curious how much manual review this actually saves?

Since the model summaries would still need to be validated against the source results manually, your business' actual viability as a product hinges on whether customers perceive a significant time savings in the data provided via these channels over historical aggregation methods (like keyword analysis that you mentioned) and level of false positives.

What do you measure as the largest impact here? Is there a large time savings, is it additional discovery from blindspots that other methods don't cover? Both? Are there additional benefits you see to this model beyond automation and expanded discovery?


Some of our customers said they spend around 3hrs a day on navigating new vulnerabilities alone. Step 1: wading through info from some easy and some hard to access sources; Step 2: trying to bring together all the most relevant information. E.g. detecting that your payroll provider got ransomed is just one step; then you have to research the group, any indicators that you might also be infected, etc. Step 3: what do I do next? E.g. adding relevant hashes to VirusTotal.

We not only help with relevant detection but also automate some of the next two steps, bringing the total time spent down to a few minutes a day.


The discovery is another important value add - AI Agents can monitor beyond human scale across more data sets to do the initial triaging for you.


I would expect a landing page to show summaries. Largest organizations impacted in last 7 days, most active exploit, etc. Instead all I see are events - apparently including tweets as a source - with minimal context. Just do what you advertise. Show me the latest breaking info from the dark web. Who is impacted, how much, and what was the vector? Better to be sorted by magnitude of impact rather than strictly chronologically. Bonus points if you consider when the user was last logged in to your platform: for people that last viewed your content a month ago, here are the biggest events from the last month. Same for weekly, daily frequencies.
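To make the suggestion concrete, here is a minimal sketch of a "since you last logged in" digest sorted by impact rather than by time. The `Event` fields, the `impact` score, and the sample events are all hypothetical and only illustrate the idea; nothing here reflects how the product actually ranks alerts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    title: str
    occurred_at: datetime
    impact: int  # hypothetical magnitude score, e.g. records exposed


def digest(events, last_login, now, top_n=3):
    """Return the highest-impact events since last_login, largest first."""
    recent = [e for e in events if last_login < e.occurred_at <= now]
    return sorted(recent, key=lambda e: e.impact, reverse=True)[:top_n]


now = datetime(2024, 6, 12)
events = [
    Event("Vendor A ransomware", now - timedelta(days=2), 500),
    Event("Forum credential dump", now - timedelta(days=20), 18000),
    Event("Old breach", now - timedelta(days=60), 99999),
]

# A user who logged in a week ago and one who logged in a month ago
# get different digests from the same event stream.
weekly = digest(events, last_login=now - timedelta(days=7), now=now)
monthly = digest(events, last_login=now - timedelta(days=30), now=now)
```

The same function covers daily, weekly, and monthly readers because the window is derived from the last login instead of being fixed.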

That said, love the initiative and focus on this space and there’s probably an opportunity to sell your data to hedge funds.


Great points - can say this is all 'coming soon'! One note, because this profile isn't personalized to a particular user's products, tech stack, and 3rd party / supply chain, it's especially chaotic. For a more tailored profile, the social, news, and dark web posts all cluster around specific events since there are far fewer critical events of interest to a specific user. No excuses, just sharing for background. Interesting point on the hedge fund use case, haven't been able to find a good user / persona to interview about that and would love any suggestions if you have any. Thanks again for checking us out.


Sort of off topic question. But how would you get into the type of work that uses this tool? I've always thought this type of work would be interesting, but I have no ideas where to start. What are the job titles? Fraud Analyst? Thanks!


Titles vary greatly but the general domain is information security and cyber security. Here is a good primer on basic qualifications and duties for a given role (generic, not tuned to a specific company) - https://www.cyberseek.org/pathway.html

This type of information (OSINT of vulns/CVEs, proofs of concept) is useful for the Blue team side of defending against attackers. With easy to access information in a timely manner, the defenders can proactively put roadblocks and alerts into place for vulnerabilities as opposed to AFTER they are popped/hacked by such.

Prevention is ideal; detection, a must.


There's no getting into anything you haven't already been doing 6+ years right now.


I love it! We’ve seen the job title change a bit depending on the sector but can be under fraud analyst, threat intel analyst, risk analyst, financial crime analyst, sometimes trust and safety teams.


What is the pricing per monitored keyword?

I know platforms like Flare are cool but when you need to monitor hundreds or even thousands of corporate keywords, domains, and assets, it becomes cheaper for CTI to just write the tools themselves.

What does your platform look like in regards to this and pricing?

For example, your pricing for monitoring 100 keywords and pricing for monitoring 500 keywords.

200k unique telegram channels is an interesting stat.

Each Telegram account (if paid Premium account) can only be in 1k channels and groups max. To monitor 200k unique channels/groups, you have a network of at least 200 paid Telegram accounts continuously monitoring? Are you using Pyrogram or Telethon for this? Are these accounts owned by you (Overwatch) or are you just using a bunch of 3rd party Telegram intel feeds?
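The account estimate above is simple ceiling division. A small sketch, taking the 1k-per-account cap and the 200k channel count from this comment as given (neither figure is verified here):

```python
def accounts_needed(channels: int, per_account_limit: int) -> int:
    """Minimum accounts to cover `channels` when each account can join
    at most `per_account_limit` channels/groups (ceiling division)."""
    if per_account_limit <= 0:
        raise ValueError("per_account_limit must be positive")
    return -(-channels // per_account_limit)


# Figures quoted in the comment: 200k channels, 1k per Premium account.
minimum_accounts = accounts_needed(200_000, 1_000)
```

This is a lower bound: it assumes no channel is joined by more than one account and ignores any rate limits or bans.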


We totally hear you and that’s why we don’t really charge by keyword but instead look at how many agents we need to deploy / use cases we build towards. 1,000+ assets is the norm for our users. Would you be interested in connecting on a call so we can better understand your use cases and tell you more about how it works?


Not really, because we just build all our stuff in house.

I would like an answer on the Telegram questions tho :)


I worked with Arjun in trust and safety / risk at both Google and Stripe. He’s not only an expert in the space, but is incredibly users-first. If you’re looking for a product like this and want a great partner, Arjun and team are it!


Congrats on the launch! As design feedback, the demos don't seem to pass the "squint test" for intuitively surfacing the most important information / actions on the screen. Maybe a more specific walkthrough of a redacted / hypothetical scenario that's focused more on the user's decision-making process & actions instead of the kitchen sink of product features would better illustrate how/why things are laid out as they are currently.


Also, the Storylane-type demo sucks. A simple video that I can rewind would be preferred.


Thanks for the advice. Maybe a Loom video with a walkthrough of a single use case would have been better. Taking it on notice for next time!


This feels a bit like a turbo RSS reader that plows through some easy and some difficult to access information and actively selects and targets it to subscribers?


That's definitely the vibe of the example we're presenting. But for specific customers, the agent can cluster and bring additional relevant context to each of the 'events', and even recommend actions / automate certain actions, e.g. if you detect a compromised account, send it to team X.


FWIW, clicking around, there are some odd display issues in the "References" (h<b>ttp</b>s://attack.<b>mitre</b>.org/techniques/<b>T1486</b>)

It looks like you're embedding data from Twitter - are you paying for decahose/enterprise access or just paying for a low volume of high value tweets (i.e. I'm seeing many from FalconFeedsio, DailyDarkWeb)


I can jump in on the data source question - we can track specific accounts and keywords on Twitter; we aren't paying for the full firehose yet. We also track the original ransomware and dark web sites and blogs being referenced, and we're still figuring out how best to cluster them all into the same event. Thanks for checking it out!
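For readers wondering what "cluster them all into the same event" can look like in its simplest form, here is a toy sketch using greedy token-overlap (Jaccard) grouping. This is a deliberately simple stand-in, not the method Overwatch uses; the threshold and the sample posts are made up.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a post."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster(posts: list[str], threshold: float = 0.3) -> list[list[str]]:
    """Greedy single pass: attach each post to the first cluster whose
    accumulated vocabulary overlaps enough, else start a new cluster."""
    clusters: list[tuple[set[str], list[str]]] = []
    for post in posts:
        toks = tokens(post)
        for seen, members in clusters:
            if jaccard(toks, seen) >= threshold:
                members.append(post)
                seen |= toks  # grow the cluster vocabulary in place
                break
        else:
            clusters.append((toks, [post]))
    return [members for _, members in clusters]


posts = [
    "LockBit claims attack on Acme Payroll",       # hypothetical tweet
    "Acme Payroll listed on LockBit leak site",    # hypothetical leak-site post
    "New CVE in ExampleVPN actively exploited",    # unrelated event
]
events = cluster(posts)
```

A real pipeline would more likely compare embeddings and extracted entities than raw tokens, but the shape of the problem (many posts, few events) is the same.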


What is your detection hit/miss rate? What happens when you miss something?

Seems like this is going to become a cat and mouse game similar to evading AV.


Great point! From some of our case studies we see users catch 25% more 3 days earlier than other solutions.

To your point, catching every threat or every alert, especially on the dark web, is always a cat and mouse game. We treat it as a prioritization problem: how do you mitigate the biggest risks quickly?

The existing OSINT tools we used are keyword-search based and pretty noisy, so we’ve been focusing on the idea that, given there’s no way analysts can find or triage every alert, how do you catch the biggest stuff? We do a few things, from AI crawlers that continue to expand data collection to AI categorization, clustering, data extraction, etc., to make it easier to track and cover the most ground.


I find it interesting that you didn't answer my question at all, tbh.


I doubt they provide any kind of guarantee that they'll catch everything. That's not what a tool like this is used for (i.e. it's not a security tool or a network monitoring service).

OSINT is about gathering data from public or semi-public places.

There are plenty of private dark web forums that these kinds of tools probably don't have visibility into, but the bigger public well-known ones are where you are most likely to see high-profile breaches being sold (since there's a wider audience and actors want to sell data quickly before it devalues), so it's certainly better than nothing...


I never asked for a guarantee, that's not feasible, I was looking to see if in-house analysts or engineers can review the data they're using or lower confidence results as a second set of eyes.


Thanks for clarifying - honestly, we're a tiny startup so not really here to play PR games, just didn't fully understand the question. Users can 100% review all the results even if they aren't ranked as 'high priority'. You can even free form search our repositories with boolean strings like you would any other OSINT tool, but with the added benefit of an AI agent to help triage, if you so wish.
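As an illustration of what a boolean search string does, here is a minimal sketch of an evaluator for `AND` / `OR` / `NOT` with parentheses and quoted phrases. The query syntax is an assumption based on common OSINT tools, not Overwatch's actual search grammar, and terms are matched as plain case-insensitive substrings.

```python
import re

# Parentheses, quoted phrases, or bare words/operators.
TOKEN = re.compile(r'\(|\)|"[^"]+"|[^\s()]+')


def matches(query: str, text: str) -> bool:
    """Evaluate a boolean query against text (precedence: NOT > AND > OR)."""
    toks = TOKEN.findall(query)
    haystack = text.lower()
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def take():
        nonlocal pos
        tok = toks[pos]
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()  # always parse, then combine
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "NOT":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        tok = take()
        if tok == "(":
            value = parse_or()
            take()  # consume the closing ")"
            return value
        return tok.strip('"').lower() in haystack

    return parse_or()


query = 'ransomware AND (payroll OR "third party") AND NOT test'
hit = matches(query, "Payroll provider hit by ransomware group")
miss = matches(query, "Ransomware test sample shared in payroll forum")
```

Malformed queries (for example an unbalanced parenthesis) raise an `IndexError` in this sketch; a production parser would report the error properly.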


Ah well, your question wasn't clear. But your clarification helps!


Dodging questions and providing a non-answer is always the sales and C-level route when they just don't want to answer you (or don't know the answer).


it probably varies widely by use-case and customer


Very interesting. It sounds like the tool is broadly powerful in combining a threat intel dashboard + news digest processor + AI features to better customize the output of the first two. The details of the API output will be important to many of your customers, as will the richness of the sources covered (forums and Telegram channels often die out and the "buzz" starts to happen in a different place, etc). Like some other commenters said, this is a fairly vendor-saturated space, so as a buyer I'd be looking for sharply presented distinction factors, beginning with price rather than AI (which is still a good thing to have).

I have a lot of experience with this kind of tool and workflow from at least three perspectives: internal builds; vendor; and consumer of vendor products such as this one. Happy to talk more if you're interested


I'd love to take you up on that offer. HN doesn't reveal emails but we have one listed here that directs straight to me, in case you're able to drop us a line: https://www.overwatchdata.ai/request-a-demo Looking forward to learning from your experience!


Great, I'll write you an email later tonight


Congratulations on the launch. I noticed that you guys are SOC2 Type II certified. I wonder how you achieved that so fast?


Thank you! It definitely didn't feel quick. We had a 3 month audit window and we used Vanta.


Waiting that 12 months to really demonstrate you have a working security program with efficient controls really pays off. It's something I look for when doing vendor reviews and I assume others do the same.


For the first SOC2, I don't hold this against a startup (I appreciate they are going through the efforts this early). Would want to see it become 6 month/1 year as the program matures. A vendor like this is low risk (aggregator of "public" information, limited data sharing, etc).

I have all sorts of issues with Vanta/Drata "compliance as a service" tools, but adequate for something like this, at this point in time.


Tbf, I’ve found it’s a good sign when an org goes through this pain early on - less chance for tech debt to pile up.

Most of my employment has been in the security auditing/testing space, and the difference between “bolting it on later” and “building it in from the start” is incredible from both a purely technical and a process standpoint.


That makes sense, we're going through our annual renewal now. It's a great experience to harden and test systems.


This looks great, I am seeing a lot of potential use cases here. It also pools together a lot of the stuff that goes unreported in the news.

Would this be a service you would ever offer to regular researchers?


Definitely! Would love to chat more. A lot of our users want more customized monitoring / agent versions of this, and our dashboards are pretty easy to customize.

Opening up certain components of the platform is something we are definitely looking into and passionate about.


Pretty cool, like the fact you tie all types of feeds (threats, vulns etc) together into a single view. How are you guys different from the plethora of other TI platform vendors out there?


In 3 important ways besides price:

-Personalization: The platform can be fully customized to your interests, e.g. 3rd party vendors, tech stack, products, peers or industry. It’s like having your personalized threat intel org that cuts through the noise. Each of the alerts is ranked and tailored to your interests.

-Customization: The platform can be used for a range of use cases, with agents undertaking tasks ranging from finding and extracting check, loan, and credit card fraud, risky narratives about your brand, through to breaches or emerging ransomware groups targeting your tech stack or vendors. Those agents can even identify breaking events that could be near your assets.

-So what and now what: Each report provides finished intel, whether finding or extracting relevant indicators of compromise, background on the threat actor and victim, compromised credentials, or compromised credit cards and checks. We can even automate workflows through integrations or creating cases/escalations for specific teams.


Good answers. I think one of the big themes in cyber for the next X years is what work can you automate or do on the security team's behalf - and you tick a lot of these boxes. Best of luck!


How is this different from Interpres?


In 3 important ways besides price:

-Personalization: The platform can be fully customized to your interests, e.g. 3rd party vendors, tech stack, products, peers or industry. It’s like having your personalized threat intel org that cuts through the noise. Each of the alerts is ranked and tailored to your interests.

-Customization: The platform can be used for a range of use cases, with agents undertaking tasks ranging from finding and extracting check, loan, and credit card fraud, risky narratives about your brand, through to breaches or emerging ransomware groups targeting your tech stack or vendors. Those agents can even identify breaking events that could be near your assets.

-So what and now what: Each report provides finished intel, whether finding or extracting relevant indicators of compromise, background on the threat actor and victim, compromised credentials, or compromised credit cards and checks. We're training our agents to be even more specific with answers e.g. "what IOCs relate to malware groups most active in the airline industry". We can even automate workflows through integrations or creating cases/escalations for specific teams.


How do you compare to incumbents like Blumira? What does your mitre coverage look like?


EDRs are a great way to help secure endpoints, but high-fidelity threat intel that is tailored to your environment and org's needs can help increase awareness and shine light on potential security blindspots. This is especially critical when the threats are ever evolving and time to exploit is decreasing year over year. Qualys in a 2023 report stated that "25 percent of these security vulnerabilities were immediately targeted for exploitation, with the exploit being published on the same day as the vulnerability itself was publicly disclosed."

EDRs offer some coverage of threats outside the perimeter, but by reputation it’s a weakness and narrowly targeted to your organization's credentials and vulns, so orgs usually still need a threat intel provider. For example, one of our users who already uses an EDR may not know about a 3rd party that’s been ransomed by a threat actor, e.g. APT 73. An alert from Overwatch saying a 3rd party has been compromised will also include information about recent IOCs, e.g. hashes and file extensions attributed to that threat actor, so that the user can add them to VirusTotal and scan internally to make sure they haven’t been compromised. This is an example of how EDRs and threat intel can work in concert.
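To illustrate the "scan internally" step, here is a minimal sketch that walks a directory and flags files whose SHA-256 hash or file extension appears in an IOC list. The function names, the `.lockedx` extension, and the sample files are hypothetical; a real check would run through your EDR or scanning tooling rather than a script like this.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_ioc_matches(root: Path, ioc_hashes: set[str],
                     ioc_extensions: set[str]) -> list[Path]:
    """Return files under root matching a known-bad hash or extension."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        if path.suffix.lower() in ioc_extensions or sha256_of(path) in ioc_hashes:
            hits.append(path)
    return hits


# Demo against a throwaway directory with made-up file contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_bytes(b"harmless")
    (root / "payload.bin").write_bytes(b"pretend-malware")
    (root / "report.lockedx").write_bytes(b"encrypted")

    known_bad = {hashlib.sha256(b"pretend-malware").hexdigest()}
    hits = [p.name for p in find_ioc_matches(root, known_bad, {".lockedx"})]
```

Hash matching only catches byte-identical samples, which is one reason alerts that also carry extensions and other indicators are useful.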


How is this different from RecordedFuture? Other than the obvious RAG capabilities


RF contracts are heavily services based and cost up to 7 figures for tailored intel across a number - we deliver a similarly personalized experience but for a fraction of the price. AI agents can also do a range of additional and customizable tasks, e.g. bringing together relevant context about a threat actor, tracking fraud methods, compromised checks and cards, narrative analysis, geopolitical disruptions, etc. They can also be automated to create new escalations and actions through integrations. It's like having RF data as well as digital analysts to do a lot of the leg work for you.


thanks, that's super helpful! godspeed!


Would love to chat with one of the founders. Can you send me an email? (profile)


Just an offtopic heads-up that emails in HN profiles aren't visible to other users, only to admins.

If anyone wants to share an email address with other users, it needs to go in the About box.


Wow. That's a really fresh concept and an important need. Congrats on the launch.


Thanks. Who is the target user for this kind of tool? A CISO team member?


Our current users are CISO's security ops team, threat intel team, blue team, fraud strategy or fraud intel team. Hope that helps!


Seems like alerting in threat intel is getting disrupted by AI - Cool.


all my problems with risk analysis tools are false positives. adding AI just sounds like there will be more of them, and they'll be harder to figure out when they happen.


Our whole aim is to make you happy by downranking those false positives while remaining explainable. We blogged about the explainability part in case you're interested: https://www.overwatchdata.ai/blog/the-imperative-of-explaina...


Arjun and Zara are amazing! They’re our batch and group office hours mates from YC S22.

We (https://www.newscatcherapi.com/) also serve the same use case but only for the news analysis part. And we don’t really have a UI: it’s all data accessible via an API.

I see a lot of questions here about comparing Overwatch to other OSINT tools. The ability to customize/personalize is a huge difference.

In my experience, clients with the most expensive problems are super underserved because there are no “Palantir-like” solutions.

Don’t get me wrong: you don’t have to do consulting — just tweak the onboarding/set up. Making bespoke solutions for the companies with the biggest problems is a great way to get into the market. And it can work at huge scale. E.g. Palantir.

Here is a very typical situation at NewsCatcher: a big bank is absolutely blown away because we can actually find news about the private companies they need to track, with minimal false positives. And all we have to do is tweak our entity disambiguation module a bit to work with the data points that the bank actually has.


Congrats on the launch! Not to sound negative about it but you do realize Overwatch is a trademark of Blizzard Entertainment …


Great team, great launch!


Congrats, looks very good!

I would hesitate with the name though. Overwatch is also a game series from Blizzard.

And Blizzard is known to be a little sue-addicted.


AI tool with the same name as a game where a robot uprising is one of the main backstories seems like it could be a bit problematic...


Haha! It's also an SEO nightmare. The term Overwatch is a force protection tactic, where you have one team providing extra assistance and cover to the main operation. If only we had played the game first before naming ourselves!


Use case: Overwatch to track Overwatch (Blizzard) boosting, account, and cheats black market.


should have at least named it Overwatch 3 to get some SEO benefits


Ha! Damn, missed a trick there.


Isn't Dota 2's anti-cheat system called Overwatch? If Blizzard didn't go after that, they aren't going to care about this.


Valve used the name Overwatch since 2013 (before Blizzard) for their player-run jury system for cheaters.


I disagree - I think the name is great.

Consider Venn diagrams: the audience for these two homonymous products has small overlap. Further moderated when you consider term stickiness to respective meanings is only high for a small fraction of that audience.

In other words, most people aware of both can cope with a mutual name. And some may even think it's cool. Each name enhancing the other through association and analogy.

Overwatch is definitely the right choice. Consider the 'OG' meaning originates from military terminology. In this context, "overwatch" refers to a tactical position where one unit provides covering fire and surveillance for another unit as it moves forward or performs an action. The overwatch position is typically elevated or strategically placed to have a clear view of the battlefield, allowing the overwatching unit to detect threats and engage enemies to protect the advancing or exposed units.

This concept ensures that the moving or vulnerable unit can operate with reduced risk, as the overwatching unit can neutralize potential dangers and provide critical information about the surroundings. The practice of overwatch is a fundamental tactic in military operations, emphasizing teamwork, communication, and strategic positioning.


I’m not sure if you are familiar with Overwatch the game, but it is a very famous massive game. My gut is that a majority (if not a huge majority) of people in tech aged 20-40 will be familiar with Overwatch, the game. This will just make following a conversation about the product unnecessarily difficult. The existing mental and SEO associations are radically strong, regardless of theoretical reasons for why “Overwatch” is a good name for the present product


I guess to put it in programming-speak which you might find a bit easier: these two names are 'out of scope' of each other - hahaha! :)

I don't think of what you call the 'theoretical' aspects as being purely theoretical, I think they're practical and conceptually associative and have provided a great name.

I do know of the game (but I haven't played it haha!), but I don't think it's a problem. I understand that you do. We'll see! Hahaha! :)

I'm no expert in SEO but it seems like it could go both ways and even boost it. For people's mental aspect, think the recall of this term will be boosted by there also being a famous game, and I don't think people will be confused by these two different categories.

What makes you think they will? What other examples have you seen where this happens?


It's a great name, it is also a trademark. I don't know enough IP law to know if this is an issue. https://trademarks.justia.com/862/39/overwatch-86239314.html

Try it and find out?


Meh. That is just the Blizzard trademark; there are several other Overwatch trademarks that are military and security related that are probably a more likely overlap.

Definitely something to consider, but more in the "send a note to a trademark lawyer and marketing folks" than "rename our entire product and brand".


not saying the name is great. I am just saying maybe you don't want to get sued by a huge multi-billion dollar giant who likes to sue.


I disagree. I'd be like: Bring it on. Play the fake victim card. Make a few showy social media posts, posting select screenshots of email chains. Rally everyone around the 'evil greediness' of megacorps. Not only do you score PR sentiment wins, you get a massive PR blitz. "Tiny US cyberintel startup sued by massive Chinese-state-linked online game" (I don't actually know if they're Chinese-state-linked but I'm channeling my inner modern-day journalist hahaha!)


Maybe split the difference by naming Sombra the CTO


It is also, and more to the point of being a real trademark issue, the name of a managed threat hunting service from Crowdstrike.


Doesn't the government have a program named overwatch as well? Something in the air force cyber transport division...


Trademarks are namespaced by subject matter.


Yep, Blizzard's trademarks are for:

> Computer game software, [ computer game discs, ] downloadable computer game programs, computer game software downloadable from a global computer network, electronic games software for wireless devices, interactive multimedia computer game programs, mousepads, computer mouses, headsets for use with computers, computer keyboards

> Printed matter, namely, [ computer game strategy guides, ] comic books, graphic novels, novels, art books, calendars, posters, [ notebooks, ] and stickers

> Entertainment services, namely, providing on-line computer games; Providing computer games that are accessed via a global computer network

> Clothing, and headgear, namely, caps, hats, hooded sweatshirts, jackets, sweaters, and T-shirts

> Toys, games, and playthings, namely, action figures, collectible toy figures, dolls, plush toys, and vinyl toy figures

https://tsdr.uspto.gov/#caseNumber=86239314&caseSearchType=U...

https://tsdr.uspto.gov/#caseNumber=86434530&caseSearchType=U...

https://tsdr.uspto.gov/#caseNumber=86239318&caseSearchType=U...

https://tsdr.uspto.gov/#caseNumber=86980853&caseSearchType=U...

A search for "overwatch" currently matches 191 registered trademarks in the US.


thanks, Kube said it a lot better than I did here. :)


I'm saying they shouldn't be worried about being sued by Blizzard. This software is not a video game or related merchandise.

Trademarks do not cover all uses of a word -- they only cover the use of a word in relation to a particular field of commerce. This is, likewise, why your grocery store can sell apples without being afraid of the equally litigious Apple Inc.


But they still would need to watch out for the even more litigious Apple Corps.


What about WWF? I always thought you'd have to be a complete imbecile or willfully dishonest to confuse the World Wrestling Federation with the World Wildlife Fund, yet here we are.


The number of times I've seen someone market a product called "The Matrix" since 1999 would tell me trademarks like this aren't a big deal.


how many of you thought this was the "Overwatch" multiplayer game? :-)


Now there’s “Overwatch” the game, “Overwatch” the anticheat platform, and “Overwatch” the OSINT platform!


I hope it's not as bad as Overwatch 2 /s



