The Times USA

Digging for Digital Gold

Posted on June 15, 2022 by admin

By Elizabeth Thede, Special for The Times USA

 

The discovery of gold in 1848 in California spawned the Gold Rush of 1849. But finding gold today has less to do with wading through streams and more to do with effective text retrieval across terabytes of data. Here’s how to get started – no Western migration required.

The first step is to build an index across the data. An index is just an internal tool that allows the search engine to quickly sift through terabytes, the digital equivalent of panning for gold. A single dtSearch index, for example, can hold up to a terabyte of text across one or more data repositories.

There are no limits on the number of search indexes that dtSearch can build and instantly search. In a concurrent-access environment like an Intranet or Internet site, search can run in a completely stateless manner, making it very easy to scale. Multiple concurrent search threads can operate with instant response time. Each user can then review search results with highlighted hits.

To start indexing, just point to the data you want to index and dtSearch does everything else. No need to even tell the software what types of data it is working with. The software will automatically recognize whether it is working with PDFs, emails, web-based formats, compressed data like ZIP or RAR, Microsoft Office files such as PowerPoint, Excel, Access, Word, OneNote, etc.

The final index contains a compilation of each unique word and number across the indexed data and its position in the data. For continually evolving data, dtSearch can update indexes automatically as often as you like using the Windows Task Scheduler. Updating an index simply adds new data, deletes old data and reindexes modified data. Index updates do not lock out searching, so instant search—even instant multithreaded concurrent searching—can continue unaffected.
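To make the idea of "each unique word and its position" concrete, here is a minimal Python sketch of a positional index that supports the three update operations described above: adding new data, deleting old data and reindexing modified data. It is an illustration only, not dtSearch's actual index format or API; the class name, method names and file names are all hypothetical.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Split text into lowercase words and numbers."""
    return TOKEN.findall(text.lower())


class PositionalIndex:
    """Toy index: every unique term maps to the documents and positions where it occurs."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: [positions]}
        self.doc_terms = {}                # doc_id -> set of terms, used for deletes

    def add(self, doc_id, text):
        """Index a new document, or reindex a modified one."""
        self.remove(doc_id)  # drop stale entries first, if any
        terms = set()
        for pos, term in enumerate(tokenize(text)):
            self.postings[term].setdefault(doc_id, []).append(pos)
            terms.add(term)
        self.doc_terms[doc_id] = terms

    def remove(self, doc_id):
        """Delete a document's entries from the index."""
        for term in self.doc_terms.pop(doc_id, set()):
            del self.postings[term][doc_id]
            if not self.postings[term]:
                del self.postings[term]


index = PositionalIndex()
index.add("letter.txt", "Gold nuggets found near Sacramento Valley")
index.add("ledger.txt", "Silver bars and gold dust, 1849")
index.add("letter.txt", "Gold nuggets found near San Francisco")  # modified file
index.remove("ledger.txt")                                        # deleted file
```

After these updates the index holds only the current text of letter.txt; the old Sacramento entry and everything from ledger.txt are gone.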

Turning to search options, for the ’49ers, the critical determination was binary: gold or not gold. Luckily, dtSearch has moved beyond a simple binary framework to provide over 25 different types of search options. You can search for words or phrases in any type of Boolean and/or/not configuration: gold nuggets and (Sacramento Valley or San Francisco) and not fool’s gold. Or you can add a proximity element, such as taking the entire previous search request and adding a requirement that gold nuggets appear within 34 words of silver bars, or, say, no more than 17 words before silver bars.
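The Boolean and proximity logic above can be sketched in a few lines of Python. This is a plain illustration of the concepts, not dtSearch's query syntax or engine: the helper names and sample documents are made up, and distance is measured between the first words of each phrase, which is a simplifying assumption.

```python
import re


def word_positions(text):
    """Map each lowercase word to the positions where it occurs."""
    table = {}
    for pos, word in enumerate(re.findall(r"[a-z0-9']+", text.lower())):
        table.setdefault(word, []).append(pos)
    return table


def phrase_starts(table, phrase):
    """Positions where every word of the phrase appears consecutively."""
    words = phrase.lower().split()
    return [s for s in table.get(words[0], [])
            if all(s + i in table.get(w, []) for i, w in enumerate(words))]


def has(table, phrase):
    return bool(phrase_starts(table, phrase))


def within(table, a, b, n):
    """Phrase a starts within n words of phrase b, in either direction."""
    return any(abs(x - y) <= n
               for x in phrase_starts(table, a) for y in phrase_starts(table, b))


def before(table, a, b, n):
    """Phrase a starts at most n words before phrase b."""
    return any(0 < y - x <= n
               for x in phrase_starts(table, a) for y in phrase_starts(table, b))


def matches(table):
    # gold nuggets and (Sacramento Valley or San Francisco) and not fool's gold
    return (has(table, "gold nuggets")
            and (has(table, "Sacramento Valley") or has(table, "San Francisco"))
            and not has(table, "fool's gold"))


docs = {  # hypothetical sample documents
    "claim.txt": "Gold nuggets from the Sacramento Valley were traded for silver bars",
    "hoax.txt": "Fool's gold nuggets turned up in San Francisco",
    "nevada.txt": "Gold nuggets and silver bars in Nevada",
    "harbor.txt": "San Francisco merchants weighed gold nuggets daily",
}
tables = {name: word_positions(text) for name, text in docs.items()}
boolean_hits = [name for name, t in tables.items() if matches(t)]
proximity_hits = [name for name in boolean_hits
                  if within(tables[name], "gold nuggets", "silver bars", 34)]
```

Here the Boolean request keeps claim.txt and harbor.txt, and adding the proximity requirement narrows the result to claim.txt alone.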

Concept searching finds thesaurus or user-defined synonyms. Add on fuzzy searching adjustable from 0 to 10 to accommodate potential typographical errors, whether relating to old document scans or current email mistypings. That way, even if San Francisco is mistyped as San Franmisco, you can still pick that up with a low level of fuzziness. By default, dtSearch will search the full text and metadata of all items, or you can limit one or more elements of a search to specific metadata.
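One common way to implement fuzzy matching is edit distance: the number of single-character insertions, deletions or substitutions needed to turn one word into another. The Python sketch below uses that technique as a stand-in. How dtSearch maps its 0 to 10 fuzziness scale onto actual matching is not described in this article, so `max_edits`, the vocabulary and the synonym table are all assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def fuzzy_matches(term, vocabulary, max_edits):
    """Indexed words within max_edits edits of the search term."""
    return [w for w in vocabulary if edit_distance(term, w) <= max_edits]


def expand(term, synonyms):
    """Concept search sketch: the term plus any user-defined synonyms."""
    return [term] + synonyms.get(term, [])


vocabulary = ["francisco", "franmisco", "fresno", "sacramento"]  # hypothetical index words
SYNONYMS = {"gold": ["bullion"]}                                 # hypothetical user thesaurus

exact = fuzzy_matches("francisco", vocabulary, 0)
fuzzy = fuzzy_matches("francisco", vocabulary, 1)
concept = expand("gold", SYNONYMS)
```

With no tolerance only the correct spelling matches; allowing a single edit also picks up the mistyped "franmisco" without dragging in unrelated words like "fresno".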

After a search, dtSearch will automatically rank search results for display with highlighted hits using so-called vector-space relevancy ranking. With that type of default ranking, if you search for silver or gold, and silver appears millions of times in the indexed data but gold just a few times, gold mentions would rank more highly and items with the densest gold mentions would rank even higher. Or you can customize ranking “on the fly” at search time, like giving silver a negative weight of 7 and gold a positive weight of 8 regardless of the terms’ prevalence in indexed data. Or you can choose to adjust the weightings if one or both terms appear in certain metadata or near the top of a file, for example.
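The ranking behavior described above resembles the familiar term-frequency and inverse-document-frequency idea: rarer terms count for more, and documents where a term is denser score higher. The Python sketch below uses that textbook approach as an approximation. It is not dtSearch's actual scoring formula, and the function name, smoothing choice and sample documents are assumptions.

```python
import math
from collections import Counter


def rank(docs, terms, weights=None):
    """Order document names by score, best first.

    Default: each term is weighted by its rarity across the collection.
    With `weights`, the caller's per-term weights are used instead,
    regardless of how common each term is.
    """
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for name, words in tokenized.items():
        counts = Counter(words)
        score = 0.0
        for term in terms:
            density = counts[term] / (len(words) or 1)
            if weights is None:
                df = sum(1 for w in tokenized.values() if term in w)
                factor = math.log(1 + n_docs / df) if df else 0.0
            else:
                factor = weights[term]
            score += factor * density
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)


docs = {  # hypothetical sample documents
    "a": "silver silver silver ore",
    "b": "silver gold",
    "c": "gold gold gold silver",
    "d": "silver mine",
}
default_order = rank(docs, ["silver", "gold"])
custom_order = rank(docs, ["silver", "gold"], {"silver": -7, "gold": 8})
```

By default the gold-dense document "c" ranks first and the silver-only documents trail. With silver weighted at -7 and gold at 8, "c" still leads, but the silver-heavy document "a" drops to last place.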

Beyond word and phrase queries, dtSearch also supports number and numeric range searches, letting you search for any number between 1948 and 2022. And dtSearch also supports date and date range searches, letting you search for any date between July 4, 1949 and April 17, 2023 even if the dates appear in different formats, like July 11, 1949 and 5/18/21. And dtSearch can even find credit cards in data, just to make sure that a stray credit card number that paid for a prospecting pan isn’t still sitting out there.
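As a rough picture of what date range and credit card searching involve, the Python sketch below normalizes two date formats into comparable values and uses the Luhn checksum, the standard check-digit test for payment card numbers, to separate plausible card numbers from other digit strings. The patterns, function names and sample text are assumptions for illustration; dtSearch recognizes far more formats than the two handled here, and its detection method is not described in this article.

```python
import re
from datetime import date, datetime

# Two illustrative date formats: "July 11, 1949" and "5/18/21".
DATE_PATTERNS = [
    (re.compile(r"\b[A-Z][a-z]+ \d{1,2}, \d{4}\b"), "%B %d, %Y"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{2}\b"), "%m/%d/%y"),
]
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits


def dates_in_range(text, start, end):
    """Dates found in text, in any known format, that fall within [start, end]."""
    found = []
    for pattern, fmt in DATE_PATTERNS:
        for match in pattern.findall(text):
            try:
                parsed = datetime.strptime(match, fmt).date()
            except ValueError:
                continue  # looked like a date but was not one
            if start <= parsed <= end:
                found.append(parsed)
    return sorted(found)


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text):
    """Digit runs of card-number length that pass the Luhn check."""
    candidates = (re.sub(r"[ -]", "", m) for m in CARD_PATTERN.findall(text))
    return [c for c in candidates if luhn_valid(c)]


notes = "Claim filed July 11, 1949; pan reordered 5/18/21; lease ends December 1, 2030."
receipt = "Paid with 4111 1111 1111 1111 for one prospecting pan; order 1234 5678 9012 3456."

hits = dates_in_range(notes, date(1949, 7, 4), date(2023, 4, 17))
cards = find_card_numbers(receipt)
```

Both in-range dates are found despite their different formats, while the 2030 date is excluded. In the receipt, the widely used test card number passes the checksum and the 16-digit order number does not.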

But now suppose someone has found gold and is trying to actually hide that fact in text. Notably, a search engine like dtSearch can also search through many types of hidden data. Some examples:

  • Metadata may be “hidden” so it is really hard to spot looking at a file in its native application, requiring an extraordinary amount of clicking around to locate. But all metadata is right there in the binary format of each file, and hence “clear as day” to a search engine.

  • Embedding documents inside other documents will also not work to obscure text. dtSearch can seamlessly parse multilevel container files, like an email with a ZIP attachment including a OneNote file with a fully embedded PowerPoint.

  • Saving a file with a misleading file extension also won’t work as a means of hiding text. You can have an Access database saved with an Excel spreadsheet extension and dtSearch will still recognize the correct file format and search it appropriately.

  • Black text against a black background, white text against a white background or gold text against a gold background is just text for a search engine.

  • And dtSearch can even identify “image only” PDFs. These are files that look externally like normal PDFs, but really consist of just a straight-up picture, with no actual digital text. dtSearch can flag those at indexing time so you know you need to run them through an OCR application like Adobe Acrobat to make them full-text searchable.
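The misleading-extension case shows why software can look at a file's contents rather than its name. Many formats begin with a recognizable byte signature, so a check of the first few bytes often reveals the real type. The Python sketch below uses a handful of commonly documented signatures; the table is far from complete, the labels and file name are made up, and this is not a description of how dtSearch itself detects formats.

```python
import os
import tempfile

# A few commonly documented leading byte signatures (illustrative, not exhaustive).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP container"),  # also wraps .docx, .xlsx, .pptx
    (b"Rar!\x1a\x07", "RAR archive"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file"),  # legacy .doc, .xls, .ppt
    (b"\x00\x01\x00\x00Standard Jet DB", "Access database (.mdb)"),
    (b"\x00\x01\x00\x00Standard ACE DB", "Access database (.accdb)"),
]


def sniff(header):
    """Guess a file's format from its leading bytes, ignoring its name."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


# Simulate an Access database saved under an Excel spreadsheet extension.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "claims.xlsx")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"\x00\x01\x00\x00Standard ACE DB\x00" + b"\x00" * 32)
    with open(path, "rb") as f:
        detected = sniff(f.read(32))
```

Despite the .xlsx name, the content check reports an Access database, since a genuine .xlsx file would start with the ZIP signature instead.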

About dtSearch. dtSearch has enterprise and developer products that run “on premises” or on cloud platforms to instantly search terabytes of “Office” files, PDFs, emails along with nested attachments, databases and online data. Because dtSearch can instantly search terabytes with over 25 different search features, many dtSearch customers are Fortune 100 companies and government agencies. But anyone with lots of data can download a fully functional 30-day evaluation copy from dtSearch.com to find buried digital gold.

RELATED: Kevin Price of the Price of Business show discusses the topic with Thede in a recent interview.
