The Times USA

Misconceptions About Enterprise Search

Posted on August 23, 2022 by admin

By Elizabeth Thede, Special for The Times USA

 

Misconceptions about enterprise search abound. This article will attempt to resolve some common ones—and get you on your way to instantly searching terabytes.

The first misconception is that unindexed search is as good as indexed search. For example, the application dtSearch® offers both indexed and unindexed search. However, indexed search is far and away the gold standard. Indexed search is instant, even across terabytes of content and even in a multithreaded concurrent search environment such as with a network installation, on a local web server, or in the cloud.

Beyond speed, indexed search also enables more search options. Most of the 25+ dtSearch search options cover both indexed and unindexed search. But indexed search has some extra options as well, like the ability to flag credit card numbers that may appear in indexed data. The indexer can run a series of digits that might represent a credit card number through a validation algorithm to determine whether it is actually a valid card number.
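The article does not name the validation algorithm. As an illustration only, the sketch below uses the Luhn checksum, the check-digit scheme commonly used for payment card numbers. The function names and the 13-to-19-digit candidate pattern are assumptions made for this example, not dtSearch's implementation.

```python
import re

def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0

# Assumed candidate shape: 13 to 19 digits, optionally split by spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def find_card_like_numbers(text: str) -> list[str]:
    """Return digit runs in the text that look like card numbers and pass Luhn."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

A checksum pass only shows that a digit run is well formed. It does not show that the number belongs to a real account, so a flag of this kind is a prompt for review rather than proof.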

The next enterprise search misconception is that building an index is somehow hard. In reality, it couldn’t be easier. All you need to do is point to the folders, email archives, and the like to index, and the search engine does everything else, reviewing each file in its binary format. From the binary format, the search engine determines the applicable file type. It then uses the file format specification for that file type to parse the full text and metadata.

Beyond storing each unique word and number in the data, the index also stores the location of each word and number. A single index can hold up to a terabyte of text. There is no limit on the number of terabyte indexes that the search engine can create, and end users can search them instantly and concurrently.
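To make the idea of storing word locations concrete, here is a minimal sketch of a positional inverted index. It is a toy in-memory structure with hypothetical function names, not the on-disk format any particular product uses. It shows why a lookup avoids rescanning the documents and how stored positions support a proximity query.

```python
import re
from collections import defaultdict

def build_index(docs: dict[str, str]) -> dict[str, dict[str, list[int]]]:
    """Map each term to {doc_id: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(re.findall(r"\w+", text.lower())):
            index[term][doc_id].append(pos)
    return index

def within(index, a: str, b: str, max_gap: int) -> set[str]:
    """Documents where terms a and b occur within max_gap words of each other."""
    docs_a = index.get(a, {})
    docs_b = index.get(b, {})
    hits = set()
    # Only documents containing both terms can match.
    for doc_id in docs_a.keys() & docs_b.keys():
        if any(abs(pa - pb) <= max_gap
               for pa in docs_a[doc_id] for pb in docs_b[doc_id]):
            hits.add(doc_id)
    return hits
```

For example, after `idx = build_index({"a.txt": "quarterly revenue report"})`, the call `within(idx, "revenue", "report", 2)` consults only the two terms' position lists and never rereads the document text.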

For changing datasets, the search engine can use the Windows Task Scheduler to update indexes as often as you like. To update an index, the search engine need only re-index files that have been added, deleted or modified since the previous index build. Updating an index does not block out individual or concurrent searching, so all searching can continue unaffected during the update.
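One plausible way to find the files that need re-indexing is to compare a snapshot of file metadata against the snapshot saved at the previous build. The sketch below assumes modification time and size are enough to detect a change. The function names are hypothetical, and a real indexer may track changes differently.

```python
import os

def snapshot(root: str) -> dict[str, tuple[float, int]]:
    """Record (modification time, size) for every file under root."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[path] = (st.st_mtime, st.st_size)
    return state

def changes_since(previous: dict, current: dict) -> tuple[list, list, list]:
    """Return (added, deleted, modified) paths between two snapshots."""
    added = sorted(current.keys() - previous.keys())
    deleted = sorted(previous.keys() - current.keys())
    modified = sorted(p for p in current.keys() & previous.keys()
                      if current[p] != previous[p])
    return added, deleted, modified
```

Only the paths in the three returned lists need attention on an update; everything else in the index can stay as it is.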

The next misconception is that a search engine will incorrectly handle files with a mismatched file extension, like a PDF saved with an .DOCX file extension. It is true that a search engine needs to correctly identify the file type of every file to determine the relevant parsing specification to apply. But a search engine can figure out the applicable file type from the binary file itself, without reference to the file extension at all. In fact, the file extension is extraneous to this process.
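As a rough illustration of identifying a file from its bytes, the sketch below checks a handful of well-known leading signatures ("magic numbers"). The signature table is deliberately tiny and the labels are this example's own. Production file-type detection looks much deeper than the first few bytes, for instance to tell a DOCX from a plain ZIP.

```python
# Leading byte signatures for a few common formats.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip-container"),              # ZIP, and ZIP-based DOCX/XLSX/PPTX
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # legacy DOC/XLS/PPT
    (b"Rar!\x1a\x07", "rar"),
]

def sniff(data: bytes) -> str:
    """Guess the file type from its first bytes, ignoring any file name."""
    for magic, kind in SIGNATURES:
        if data.startswith(magic):
            return kind
    return "unknown"

def sniff_file(path: str) -> str:
    """Read only the first bytes of a file and classify it."""
    with open(path, "rb") as fh:
        return sniff(fh.read(8))
```

Because `sniff_file` never inspects the file name, a PDF saved as `report.docx` is still classified as `"pdf"`.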

The next misconception is that a mistype will thwart a search engine. Say you mistype Mississippi in an email, maybe adding or deleting an S or mistyping a P as a Q. Fuzzy searching, which adjusts from 1 to 10, accommodates such text deviations. Even a low level of fuzzy searching would pick up any of these Mississippi mistypes. Fuzzy searching works alongside other search types, like Boolean and/or/not searching and proximity searching, so it is easy to just keep fuzzy on at a low level while searching.

Why not leave fuzzy searching on at a high level? While a higher level of fuzzy searching will pick up the largest number of typographical and OCR deviations, it also finds false hits. At some point, Mississippi with a high enough level of fuzziness is also going to pick up Missouri, so it’s a trade-off.
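The trade-off can be illustrated with Levenshtein edit distance, used here as a stand-in for fuzziness. The article does not say how the 1-to-10 fuzzy scale maps to character edits, so the `max_edits` threshold below is an assumption of this sketch rather than the product's setting.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def fuzzy_match(query: str, word: str, max_edits: int) -> bool:
    """True if word is within max_edits character edits of query, ignoring case."""
    return edit_distance(query.lower(), word.lower()) <= max_edits
```

Under this measure, each of the mistypes mentioned above (a dropped S, an extra S, a P typed as Q) is one edit away from Mississippi, while Missouri is six edits away. A threshold of 1 catches the typos, and only a much looser threshold starts matching Missouri.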

The next misconception is that text that is obscure in an associated application display will be equally unapparent to a search engine. If you look at a standard file—PDF, Word, Excel, Access, PowerPoint, OneNote, etc.—in its native or associated application, white text against a white background, black text against a black background and the like can be very hard to spot. But in binary format, black on black or white on white is just as apparent as regular black on white writing.
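A small sketch shows why color makes no difference to text extraction. The fragment below is a simplified, hand-written imitation of the WordprocessingML markup found inside a DOCX file, with one run of text colored white. It is an illustration, not a complete document, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by DOCX document markup.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Simplified paragraph: the second run is styled white (FFFFFF).
FRAGMENT = f"""
<w:p xmlns:w="{W}">
  <w:r><w:t>Quarterly summary.</w:t></w:r>
  <w:r>
    <w:rPr><w:color w:val="FFFFFF"/></w:rPr>
    <w:t>Hidden note in white text.</w:t>
  </w:r>
</w:p>
"""

def extract_text(xml: str) -> list[str]:
    """Collect every text run, regardless of how the run is styled."""
    root = ET.fromstring(xml.strip())
    return [t.text or "" for t in root.iter(f"{{{W}}}t")]
```

Calling `extract_text(FRAGMENT)` returns both runs. The color property is just another element in the markup, so the white text is as readable to a parser as the visible text.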

Likewise, certain metadata is easy to miss in an associated application in that it can take a whole lot of clicking around before you even realize it is there. But all metadata is equally apparent in the binary format of a file. Similarly, a file can have a recursively embedded document inside of it where only a few lines of the embedded document may be visible by default. But the whole embedded file is easily accessible in a binary format view. A search engine can also handle a multilevel nested file structure, like an email with a ZIP or RAR attachment containing a Word document with an Excel spreadsheet embedded inside.
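The nested case can be sketched with recursion. The example below handles only ZIP containers, using Python's standard `zipfile` module, and recognizes a nested ZIP by its leading signature. Email, RAR, and embedded-object parsing are out of scope here, and the `!/` path separator is just a convention chosen for this example.

```python
import io
import zipfile

def walk_container(data: bytes, prefix: str = "") -> list[str]:
    """List every member path in a ZIP, descending into ZIPs nested inside it."""
    paths = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            path = f"{prefix}{info.filename}"
            paths.append(path)
            member = zf.read(info)
            # ZIP-based members (including DOCX or XLSX) start with this signature.
            if member.startswith(b"PK\x03\x04"):
                paths.extend(walk_container(member, prefix=path + "!/"))
    return paths
```

Because DOCX and XLSX files are themselves ZIP containers, the same recursion reaches a spreadsheet embedded in a Word document inside a ZIP attachment, one level at a time.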

About dtSearch. dtSearch has enterprise and developer products that run “on premises” or on cloud platforms to instantly search terabytes of “Office” files, PDFs, emails along with nested attachments, databases and online data. Because dtSearch can instantly search terabytes with over 25 different search features, many dtSearch customers are Fortune 100 companies and government agencies. But anyone with lots of data to search can download a fully-functional 30-day evaluation copy from dtSearch.com.

 

RELATED: Kevin Price of the Price of Business show discusses the topic with Thede in a recent interview.
