Google has been introducing a flurry of upgrades to Gemini (formerly Bard), including enhanced reasoning, and now competes with Deep Research/Search offerings from OpenAI, Grok, and Perplexity, to name a few. I casually compared a few of my own research reports against the latest Deep Research/Search versions from Gemini, OpenAI, and Grok, and then took a deeper dive into Gemini Deep Research with 2.5 Pro.
Historically, my main concerns with AI-generated search and research have been source transparency, the lack of direct linkage back to sources, and the very real tendency to hallucinate. With the release of Deep Research offerings, these issues are now being more seriously addressed.
Source Transparency
In Gemini (as with OpenAI Deep Research and Grok), the thinking-out-loud feature is notable. This wow feature in Gemini identifies the research focus and key search terms, shows how and which resources are gathered and how the scope is determined, and nicely helps in defining and categorizing the market being researched. It confirms that Gemini will be researching the requested topic correctly before moving forward, and it allows for editing. For example, here is the “Research Websites” portion of the methodology to be used on a follow-up request about the cold storage (logistics) market: “Are there startups that have recently come out of stealth in the cold storage market?”

Once the report is generated, it frequently notes sources via link dropdowns at the end of each paragraph or bullet point to identify the sources used. This is a nice feature, but of more value is the works cited list at the end of the exported document: a complete list of the sources used, linked by superscript number. A few of the resources linked at the end of the report did not exactly match what was written in the report; they were not necessarily wrong, but not as accurate as they could have been. However, most were credible matches. The citations in this list carry no publication dates (unless one is embedded in the URL shown); only the date accessed is provided, which is not especially helpful and can be misleading at a glance. Here is a section showing the dropdown citations and the superscript numbers.

Here is the first page of the works cited list from a request to identify tariff monitoring tools launched between February 23, 2025 and April 23, 2025. (There were 74 cited sources in the list.)

Source Relevancy
Gemini Deep Research (along with other deep research offerings) does not go behind paywalls, and thus only cites and uses the open-access portion of resources, such as press releases and report descriptions/overviews. (This can still be very helpful information.) What is helpful is that it can glean from multiple packaged market research reports covering the same market and consolidate the information quickly, cutting down a time-intensive process. There are both high-quality packaged market research reports and not-so-great ones out there. The ones picked up by Gemini on my queries were mostly ones I consider useful for inclusion, and the dates of the reports were current. Because the information accessed does not go behind paywalls, authoritative resources such as academic and business-related databases and market data/analyst offerings are not available or used.
Results are “surface level,” and open-access resources that sit below surface-level links (requiring additional clicks) can be missed, such as information found in transcribed text from interviews, podcasts, videos, trade show demonstrations, CEO/CFO/CTO banking/investor conference talks, and the text of government contracts and meeting minutes. With correct prompting, it will pick up text in earnings calls and investor presentations, which are accessible via a single click/direct link; an illustrative prompt follows.
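To give a sense of what I mean by correct prompting, a request along these lines (illustrative only, not one of my actual prompts) explicitly directs the tool toward those single-click sources: “Review Q1 2025 earnings call transcripts and investor presentations from publicly traded logistics and supply chain companies and identify any announcements of new tariff monitoring or trade management software.”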
Sources picked up in most of the requests I ran were solid. The sources consulted were noted to be part of “a multi-pronged approach used to ensure comprehensive coverage.” For example, for a request for new tariff monitoring software tools launched between February 23, 2025 and April 23, 2025, the sources consulted included major news wires and press release distribution services (for announcements related to tariff monitoring and trade management software launches) and technology and industry publications covering supply chain management, logistics, and international trade, such as Supply Chain Management Review, Supply Chain Dive, Trucking Dive, and Inbound Logistics. Software review platforms and vendor websites were also consulted.
Concerns Remain
It is unclear exactly which sources are and are not available for inclusion in Deep Research offerings. Website publishers have the option to block web crawlers from accessing their content, and many have not provided permission, nor have they licensed their content for LLM training purposes. Keeping up with developments in this area, for Gemini Deep Research and other similar offerings, is overwhelming. For example, not only has the MIT Press not allowed access to its publications for LLM training, but it is also “aware that many MIT Press publications have ended up in pirated training data sets.”
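As a sketch of the blocking mechanism mentioned above, a publisher can disallow specific AI crawlers in its robots.txt file. The two tokens shown here are publicly documented (Google-Extended governs use of content for Gemini and related AI training; GPTBot is OpenAI’s crawler), though whether any given crawler honors the file is up to the crawler:

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /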
Even if you know which major publishers are blocking content for a particular LLM/offering, a disturbing March 2025 study by the Columbia Journalism Review reveals that blocking content doesn’t guarantee the content is inaccessible and that “formal licensing deals do not necessarily translate to more accurate identification or attribution of publisher content.” Even though that study is very current, it does not cover the newer, more advanced versions launched or made available since, underscoring the difficulty of keeping up to date in this area.
This blocking of AI bots, according to Jordan Harrod, is particularly problematic when it comes to deep research tools “because the types of web pages that tend to block their ability to be used as training data tend to be higher authority websites,” meaning “the report that you end up getting means that the sources of information that end up being the ground truth, so to speak, of the output that the model returns to you are lower quality or at least skewed towards being lower quality.”
Ms. Harrod and several others have pointed out that deep research tools are more effective for those with greater subject expertise in the areas being researched. These individuals can more easily verify whether sources are relevant and credible, and subject matter experts can most likely identify higher-quality resources regardless of which side of the paywall they fall on. Most important, knowing which prompts to use during the process is key. I agree with Sam Edelstein’s observation about OpenAI’s offering that “ultimately the direction Deep Research takes will only be as good as the prompt you give it.”
Image by Friedrich Teichmann from Pixabay