In Marvel’s The Avengers, Iron Man is known for his quick, brash, and (mostly) effective tactics in battle. He strikes fast, and while he doesn’t always get the job done the best possible way, his methods are usually good enough in the moment. The Hulk, on the other hand, is a bit bulkier, but his brute strength packs a much bigger punch when he uses it to full capacity.
The way these two superheroes differ in their approach (and their results) mirrors the difference between types of brand commerce data.
How, you may wonder? Well, the quality of your data depends on the method you use to acquire it. Whether it’s feed-based data or crawled data will impact what you can rely on it for and how scalable your data collection process will be.
At PriceSpider, we offer crawled data for all of our products: Where to Buy, Prowl, and Brand Monitor. This gives our customers a distinct advantage over brands that can only rely on feed-based data.
In this article, we’ll examine the epic battle between feed-based data and crawled data, explain some of the differences, and highlight how you can best use data to become a hero for your brand.
What are feed-based data and crawled data?
To understand the differences between feed-based data and crawled data, let’s start by defining our terms.
Feed-based data
Most minimum advertised price (MAP) monitoring software and other ecommerce tools use feed-based data to track information about your product pages.
Feed-based data relies on data feeds to periodically update information. There are many different kinds of data feeds, but in the context of ecommerce, we’re usually talking about product feeds. These product data feeds contain defined attributes like product page title, product name, product price, product images, and product description.
A data feed gathers these attributes for a range of product pages and refreshes them at a predetermined frequency: once a day, once a week, once a month, or whatever interval is deemed appropriate for the feed’s use case.
You can generally count on a data feed to provide you with accurate information on each of the defined attributes as of the most recent feed update. But you won’t get any information beyond what has been defined, nor will you know whether the feed’s most recent data actually reflects the current live data on the product pages, especially if you don’t know when the feed gets updated.
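To make that concrete, here’s a minimal sketch in Python of what a single feed record might look like. The field names and the staleness check are purely illustrative, not any particular feed’s schema:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical feed record: these field names are illustrative,
# not any particular retailer's or PIM's schema.
@dataclass
class FeedRecord:
    product_title: str
    product_name: str
    price: float
    image_urls: list[str]
    description: str
    feed_updated_at: datetime  # when the feed itself was last refreshed

def is_stale(record: FeedRecord, max_age: timedelta = timedelta(days=1)) -> bool:
    """A feed only reflects the page as of its last refresh, so anything
    older than the expected update interval should be treated with caution."""
    return datetime.utcnow() - record.feed_updated_at > max_age
```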
Crawled data
Crawled data goes directly to the source at the time of the crawl request in order to provide near real-time updates on product information. Unlike feed-based data, which tells you what a page was like X hours or days ago, crawled data tells you what’s on a product page right now. It crawls the page like a customer would, and then it reports back to you with what it found.
Furthermore, crawled data is not limited to a predefined list of fields. A data crawl can pull whatever definable pieces of information you request from it, including search results, enhanced product page content, image order, and more.
However, because crawled data targets such precise information and delivers results in real time, it can be less efficient than feed-based data when processed at a large scale.
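Here’s a simplified sketch of what a crawl boils down to: fetch the live page, parse it, and pull out whichever fields you care about. The URL and CSS selectors below are placeholders, and a production crawler also has to deal with JavaScript rendering, bot detection, and polite rate limiting:

```python
import requests
from bs4 import BeautifulSoup

def crawl_product_page(url: str) -> dict:
    """Fetch a live product page the way a shopper's browser would and
    extract whichever fields matter to you right now."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Placeholder CSS selectors: every retailer structures its pages differently.
    price_el = soup.select_one(".product-price")
    stock_el = soup.select_one(".availability")

    return {
        "price": price_el.get_text(strip=True) if price_el else None,
        "in_stock": bool(stock_el and "in stock" in stock_el.get_text(strip=True).lower()),
    }

# Example (hypothetical URL):
# snapshot = crawl_product_page("https://retailer.example.com/products/1234")
```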
Feed vs. crawl
The following table breaks down the key differences between feed-based data and crawled data when monitoring the digital shelf:
| Criterion | Feed | Crawl |
| --- | --- | --- |
| Independence and reliability | Limited to retailer/PIM data feeds and sources, and often dependent on human factors (data uploads), contracts, and tech issues | Free from those constraints; impossible to be locked out |
| Quality (monitoring depth) | Limited and outdated data | Data gathered the same way customers see it while shopping |
| Frequency | Restricted to retailers’ update schedules and business rules, with possible human errors and delays | Can crawl as often as needed |
| Scope (monitoring width) | One size doesn’t fit all; data constrained by feed fields | Can crawl any aspect of the digital shelf: titles, descriptions, brand-specified details, price, content, availability, and ratings and reviews |
| Retailer network (monitoring width) | Dependent on contracts and data feed setup | Can crawl any retailer, global or local, with no dependencies |
| Accuracy | Repurposed PIM/retailer data may not answer your questions or map to your strategic KPIs | Monitors and analyzes whatever data you need (flexible monitoring scope) |
When to choose crawled vs. feed-based data
At this point, it may feel like there’s no good reason to choose feed-based data over crawled data. It’s true that crawled data is the superior option on these major points of comparison. Nonetheless, feed-based data does the job well enough for a number of applications. It persists because it’s relatively easy to set up, easy to process at scale, and fairly affordable. Crawled data, on the other hand, is more advanced, harder to process on a large scale, and more expensive.
So the question you need to ask is: Does my application justify the extra expense of crawled data? The answer to that question is going to vary on a case-by-case basis, but it typically comes down to whether or not you rely on accurate, up-to-date information.
Here are a few instances where crawled data can make all the difference.
Tracking stock availability
Brand Monitor, PriceSpider’s digital shelf analytics solution, uses crawled data to track your stock availability. Our store locator software, Where to Buy, also uses it to help you avoid driving traffic to out-of-stock pages.
When your products go out of stock, you need to know right away. Every moment your product page is up with no availability is a lost opportunity. Any ad spend on that product is wasted sending customers to a page they can’t order from. Those customers will be frustrated that they can’t make a purchase, and they may give up and go to a competitor instead.
You can’t wait a week for the feed to update before you learn that you’re out of stock. You need to know right away so you can redirect advertising, send customers to a different retailer that isn’t out of stock, and work on getting your product back in stock as soon as possible.
This principle is important for competitors’ product pages as well as your own. If you learn that a competitor’s product has just gone out of stock, that gives you a window of opportunity to advertise on your competitor’s branded keywords and direct their frustrated customers to your comparable offering.
Crawled data can keep you up to date with near real-time stock availability. But if you’re waiting on feed-based data, it could be days or even weeks before you learn about the problem (or opportunity).
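As a rough illustration (not PriceSpider’s actual alerting logic), catching that moment can be as simple as comparing two consecutive crawl snapshots and alerting on the in-stock-to-out-of-stock flip:

```python
def check_stock_transition(previous: dict, current: dict, sku: str) -> None:
    """Compare two crawl snapshots of the same product page and alert
    the moment availability flips from in stock to out of stock."""
    if previous.get("in_stock") and not current.get("in_stock"):
        # Hypothetical follow-up actions: pause ads for this SKU, point
        # Where to Buy traffic at a retailer that still has inventory,
        # and notify the supply chain team.
        print(f"ALERT: {sku} just went out of stock.")
```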
Ensuring consistent product details
Brand Monitor uses crawled data for content compliance to give you greater visibility into how well sellers are following your brand guidelines.
You work hard to make sure your images and descriptions are accurate, informative, and helpful, putting your products in the best light to help customers make a purchase decision. But resellers don’t always stick to your product details, whether they accidentally or intentionally change what you’ve prepared or fail to update product listings when new details come out.
When this happens, it can lead to confusion and frustration. You lose sales from some customers and win sales from others under false pretenses, only to have them return your product because it didn’t match what was advertised.
The quicker you can catch these discrepancies, the quicker you can step in to ensure they are fixed. You don’t want to wait a week with misleading product information being promoted.
Spotting MAP violations
Although many MAP monitoring solutions rely on feeds for MAP enforcement, PriceSpider’s Prowl uses crawled data to give you the most accurate and timely results.
The prices you set for your products communicate a lot about them. Whether you’re aiming for premium pricing, economy pricing, or somewhere in between, the right price shapes the associations customers form with your products. If resellers list your products too low, it not only cuts into profit margins but can also harm your brand’s reputation. And when one seller undercuts your established prices, others tend to follow.
So it’s important to have a MAP policy in place, and it’s equally important to actually enforce it. But in order to enforce a MAP policy, you need up-to-date insights into what resellers are listing your products for. Feed-based data can help a little, but it will be on a delay. And a lot of damage can be done if sellers start a “race to the bottom” during the week you’re waiting for a feed to update.
Furthermore, feed-based data is vulnerable to gaming by particularly crafty sellers. If they figure out when your feed updates, they can set their prices to meet your requirements during that window while listing them far lower between updates. You could go months or years without your feed-based data ever alerting you to the problem.
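A common countermeasure, sketched below as a general idea rather than a description of how Prowl schedules its crawls, is to spread checks across randomized times so sellers can’t predict when they’re being watched:

```python
import random
from datetime import datetime, timedelta

def next_crawl_times(start: datetime, days: int = 7, per_day: int = 4) -> list[datetime]:
    """Schedule crawls at unpredictable moments so a seller can't simply
    look compliant at a known, fixed feed-update hour."""
    times = []
    for day in range(days):
        for _ in range(per_day):
            # Pick a random second within each day of the window.
            offset = timedelta(days=day, seconds=random.randint(0, 86_399))
            times.append(start + offset)
    return sorted(times)
```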
Gain access to the best crawlers on the market
For over 20 years, PriceSpider has been building tools to help brands earn more sales, create better experiences, and discover deeper insights into their performance.
Where to Buy provides a seamless shopping experience for your customers. Using crawled data, we make sure they’re directed to retailers that have your products in stock. Prowl enforces your MAP policy using crawled data to keep track of your resellers in real time, ensuring they don’t undercut your prices. And Brand Monitor is the most advanced digital shelf analytics software available, using the best crawlers on the market to track your product pages (and those of competitors) across different retailers and resellers.
Our crawlers constantly check your page content, prices, images, stock availability, and more, alerting you the moment you need to take action.