FultonHistory.com Research Guide

To follow along with this guide, you will need to understand Boolean searches. For a primer, see the help page on FultonHistory.com.

Contents

  1. Run a Boolean Search
  2. Understand File Names
  3. Understand Preview Text
  4. View a Page
  5. Download a Page
  6. Remove Highlighting
  7. Find the Front Page
  8. Make Your Composite Image With the Newspaper Masthead

 

Run a Boolean Search

The following message came up in one of my genealogy groups. It seemed like an excellent candidate for a Fulton History search because the people lived in New York State in the nineteenth century, and the names didn’t seem prohibitively common.

I’m looking for confirmation that my 4th Great Grandfather worked to build the Erie Canal. Is there a list of workers somewhere?

His name was John Calvin Batchelor (used ‘Calvin’), and there are some easy misspellings (Bachelor, Batchellor, Batcheler, Batchelder, etc).

I believe he was born in 1788 and was in the NY Militia in 1814, and left the Militia to work as a mason (stone cutter, brick layer, etc) on the Erie Canal around that time.

Any ideas? Thank you in advance 🙂

Copy the text from the message above so that you get all the alternate spellings correct. Then edit it, adding the proper syntax to convert the question into the Boolean search below.

Calvin w/2 (Batchelor or Bachelor or Batchellor or Batcheler or Batchelder)

This will tell the server to find all pages in which the word “Calvin” occurs within two words of any of the following words: “Batchelor”, “Bachelor”, “Batchellor”, “Batcheler”, “Batchelder”.
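If you have a long list of alternate spellings, a few lines of Python can assemble the query for you. This is only a sketch: the function name build_proximity_query is my own invention, and the only search syntax it relies on is the w/2 and “or” operators shown above.

```python
def build_proximity_query(given_name, surname_variants, distance=2):
    """Return a query such as: Calvin w/2 (Batchelor or Bachelor)."""
    # Drop blanks and duplicates while keeping the original order.
    cleaned = []
    for name in surname_variants:
        name = name.strip()
        if name and name not in cleaned:
            cleaned.append(name)
    alternatives = " or ".join(cleaned)
    return f"{given_name} w/{distance} ({alternatives})"


variants = ["Batchelor", "Bachelor", "Batchellor", "Batcheler", "Batchelder"]
query = build_proximity_query("Calvin", variants)
print(query)
```

Paste the printed line into the Search box exactly as you would a query typed by hand.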

Go to https://fultonhistory.com/Fulton.html. In the box under “Items per page”, select “Boolean”.

 

Copy the Boolean search you made above, paste it into the Search box, and click the “Search” button.

 

Scroll down and look at the results. Note that the page displays the query you entered, along with the number of results. You should get 111 results.

 

Understand File Names

Scroll down farther to see the list of preview results in the left pane. Stop. Take the time now to understand the structure of the file name. This will save you a great deal of confusion later.

 

All the files are in Adobe Acrobat (PDF) format, so every file name ends with “.pdf”. Aside from that, the file name has three parts. Look closely at the link in the first result.

FultonHistoryInstructions_04a - Sidebar link detail

Name of newspaper
Only one aspect of this name is consistent: the state abbreviation. In my experience, if a newspaper was published in New York State, “NY” will appear in the file name.

Other than the state abbreviation, do not expect the file name to contain the name of the region where the newspaper was published. The name of the city may be misspelled. The newspaper may be named after the county or after an adjacent town. Unless you already have a comprehensive knowledge of exactly what newspapers were published in the region you want to search, do not attempt to restrict by region. You will miss plenty of potentially important sources.

I will say this repeatedly throughout this guide: When composing your searches, err on the side of less specificity. A few thousand search results may seem overwhelming to you now, but trust me: a few months from now you will find yourself broadening your search and taking the time to eyeball those thousands of results. If you slog through them now, you’ll save time in the long run.

Date range
This reflects the year, or a range of years, in which the newspaper was published. It may contain more specific information, such as a range of months. It is often inaccurate: the year may simply be wrong, or a file labeled 1856 may actually run from September 1856 through August 1857. The range may also span multiple years, e.g. 1850-1855. Remember: When composing your searches, err on the side of less specificity. Be overly generous with your date range. If you’re looking for something that happened in 1856, enter a Boolean date range expression such as 1855~~1860. This will radically reduce your chances of missing important results.
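To make the “be generous” habit automatic, you could pad the year before typing the range. The sketch below is hypothetical: the function name and the default padding (one year before, four years after) are assumptions of mine, chosen only because they reproduce the 1855~~1860 example above.

```python
def generous_year_range(target_year, years_before=1, years_after=4):
    """Return a padded range expression such as 1855~~1860."""
    # The padding values are assumptions; widen them if the file
    # names for your newspaper seem especially unreliable.
    start = target_year - years_before
    end = target_year + years_after
    return f"{start}~~{end}"


print(generous_year_range(1856))
```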

Page sequence
This is a number reflecting the order in which the images were scanned. This example has a sequence of 4447, which means that this page was somewhere around 4,447th in the scanning sequence. All the other pages in that batch of newspaper scans have the same file name, except for that number. The next page will have a file name that differs in only one respect: it will probably have the sequence 4448. The previous page will probably have the sequence 4446. If it’s a front page, the last page of the previous day’s issue will probably be 4446.

Note how I used the word “probably” above. That’s because scanning methods were wildly inconsistent. There are gaps. There are missing pages. In many cases, the person doing the scanning went from the back of the newspaper to the front, so 4448 may well be the previous page, rather than the next! Remember this. It will be important later.
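If you keep a log of the files you have looked at, it can help to split each file name into its parts. The sketch below is based only on the one file name quoted later in this guide, with a plain hyphen standing in for the dash. The regular expressions and the function name parse_file_name are my assumptions, and the parenthesized numbering mentioned in the next paragraph is not handled.

```python
import re


def parse_file_name(file_name):
    """Split a result file name into newspaper, year range, and sequence."""
    stem = file_name[:-4] if file_name.lower().endswith(".pdf") else file_name
    # Year or year range: a four-digit year, optionally followed by -YYYY.
    year_match = re.search(r"\b(1[6-9]\d{2}|20\d{2})(?:-(\d{4}))?\b", stem)
    # Page sequence: the last run of digits in the name.
    seq_match = re.search(r"(\d+)\D*$", stem)
    first_year = last_year = None
    newspaper = stem
    if year_match:
        first_year = int(year_match.group(1))
        last_year = int(year_match.group(2) or year_match.group(1))
        newspaper = stem[:year_match.start()]
    return {
        "newspaper": newspaper.strip(),
        "first_year": first_year,
        "last_year": last_year,
        # Kept as text so that leading zeros survive.
        "sequence": seq_match.group(1) if seq_match else None,
    }


parts = parse_file_name(
    "Onondaga Hollow NY Onondaga Register 1814-1819 sp - 0335.pdf"
)
print(parts)
```

Treat the years it returns as a hint, not a fact, for all the reasons given above.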

Now look closely at the second result. See how the numbering sequence includes parentheses? It’s crucial to remember this for later, when you’re trying to navigate from page to page.

FultonHistoryInstructions_04b - parens

 

Understand Preview Text

Moving on to the preview text, spend some time looking at it. Consider what it contains, and what it doesn’t.

When the scanning software reads a page, it makes its best guesses as to how to cluster the text. It often reads text from adjacent columns as one continuous block. So the selected blocks of text you see in that preview may be a patchwork of unrelated text from separate articles. This is a crucial point to remember, because even if two words appear in the same sentence, if there’s a line break between them, the pdf may “think” the top line continues in the adjacent column. As a result, the Boolean search might think those two words are thirty words apart instead of two.
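Here is a toy illustration of that column problem. The two “columns” are invented text, and I am assuming that a proximity search simply counts word positions, which may not match exactly how the site counts them.

```python
def min_word_distance(words, first, second):
    """Smallest number of word positions between `first` and `second`."""
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    if not first_positions or not second_positions:
        return None
    return min(abs(a - b) for a in first_positions for b in second_positions)


# Two side-by-side columns, three printed lines each (invented text).
left_column = ["Mr. Calvin", "Batchelor of", "this town"]
right_column = ["cattle sold at", "auction on the", "public square"]

# Correct reading order: the whole left column, then the right one.
correct = " ".join(left_column + right_column).split()

# Misread order: each printed line runs straight across both columns.
misread = " ".join(
    left + " " + right for left, right in zip(left_column, right_column)
).split()

print(min_word_distance(correct, "Calvin", "Batchelor"))
print(min_word_distance(misread, "Calvin", "Batchelor"))
```

In the correct order the two names are adjacent. In the misread order, three words from the neighboring column land between them, so a w/2 search would no longer match.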

Note how garbled the text is. It’s extremely rare to see even a short sentence with all the words scanned correctly. The scanning software has to guess at blurry third-generation microfilm copies, so the vast majority of the results are hash. Remember that this hash is what the search engine is actually searching. So, depending on the complexity of your search, it may miss five or ten results for every one it finds.

There are two morals to this story.

  1. Never stop thinking of different angles from which to approach your search. The name you’re looking for may be right there on the page, but because the OCR read the “m” as “rn”, you’ll never find that page with any search that depends on that name being there. Often I’ve found important articles about a particular person while searching for the name of their sibling, simply because the OCR happened to read only the sibling’s name correctly.
  2. Sometimes the text is so garbled that no search will ever find that page. Therefore, the only way you will ever find that page is with good old Mark I Eyeball. In other words, you go to a specific newspaper in a specific date range and start reading, using the page-by-page navigation techniques in the Find the Front Page section.
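One way to act on the first moral is to search for the likely misreadings as well as the real name. The sketch below swaps “m” and “rn” in both directions, since that is the one confusion mentioned above. The function name ocr_variants and the example surnames are invented for illustration, and any other confusion pairs you add are your own guesses.

```python
def ocr_variants(name, confusions=(("m", "rn"),)):
    """Return the name plus spellings with one confusable piece swapped."""
    variants = {name}
    for a, b in confusions:
        # Try the swap in both directions: m -> rn and rn -> m.
        for old, new in ((a, b), (b, a)):
            start = name.find(old)
            while start != -1:
                variants.add(name[:start] + new + name[start + len(old):])
                start = name.find(old, start + 1)
    return sorted(variants)


print(ocr_variants("Palmer"))
print(ocr_variants("Warner"))
print("(" + " or ".join(ocr_variants("Palmer")) + ")")
```

The last line prints a parenthesized “or” group that can be dropped into a Boolean search. The swap is case-sensitive, so a capital “M” at the start of a name is left alone.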

 

View a Page

Now click on the top link. Note that it changes color to help you remember that you already clicked it. The page in question appears in the viewing pane.

 

You can scroll using the arrows on your keyboard or the scroll bars on the right and bottom of the viewing pane. The scroll bars won’t be visible unless you’re able to scroll in that direction. In this example, the bottom of the page is hidden from view, so the scroll bar on the right is visible, thus allowing you to scroll down.

 

Hover your mouse pointer near the bottom right of the page, and the zoom controls will appear. Use the + and – buttons to zoom in and out. Note how the words you were searching for are highlighted.

 

Download a Page

Hover your mouse pointer near the top right and several buttons will appear. Click the Download button.

 

The “Save as” dialog will appear. Note that the name of the file is highlight-for-xml.pdf. That is the name of any pdf rendered by the highlighting software.

 

Save and open the pdf. Note that the pdf file you downloaded has the same highlighting as the page in the viewer. If you are OK with that, you don’t need to remove the highlighting, so you may skip the Remove Highlighting section.

 

Remove Highlighting

Right-click (click the right mouse button) the result link in the left pane and select “Open link in new tab”.

 

Note that a new tab appears in your browser.

 

Click the new tab. Note that you now see the same page, with the same highlighting, but now you have the location of the pdf file in the URL bar.

 

Copy the entire URL and paste it into a text editor such as Notepad.

 

This URL contains a lot of XML syntax after the file location. Your goal is to strip away all that extra text so that only the file location remains.

Starting at the beginning of the text, search for the text “.pdf”. Sometimes a directory name contains “.pdf”, so you are not necessarily looking for the first occurrence. You’re looking for the first occurrence of “.pdf” that is followed immediately by an XML reference.

 

Once you’ve found the first occurrence of “.pdf” that is followed immediately by an XML reference, select everything after “.pdf”.

Delete all that extra text. Now you have a URL pointing to the pdf file for your page, without any XML highlighting syntax.
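If you do this often, the trimming can be scripted. The sketch below assumes the XML reference begins with “#xml=” directly after “.pdf”. That marker is my assumption, not something this guide confirms, so check it against a URL from your own browser and change the marker if it differs. The URL in the example is made up.

```python
def strip_highlighting(url, marker="#xml="):
    """Cut the URL right after the '.pdf' that precedes the marker."""
    # Searching for '.pdf' plus the marker skips any directory
    # name that merely contains '.pdf'.
    cut = url.find(".pdf" + marker)
    if cut == -1:
        return url  # nothing recognizable to strip
    return url[:cut + len(".pdf")]


# Invented URL, for illustration only.
highlighted = (
    "https://example.com/scans.pdf.archive/"
    "Some%20Paper%201814-1819%20-%200335.pdf"
    "#xml=https://example.com/hits?id=1"
)
clean = strip_highlighting(highlighted)
print(clean)
```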

 

Select your “clean” URL. Go back to your browser and delete all the text from the URL bar. Paste your clean URL into the URL bar and press Enter.

 

You now see the same page without the highlighting. Click the Download button.

 

Note that now the “Save As” dialog contains the same file name that you originally saw in the link. That’s because you’re saving the “bare” file, without any highlighting.

 

Now you have the “bare” pdf file, with no highlighting.

 


 

For the next example, let’s use the first article I found that pertained to the original poster’s request.

Scroll down until you see results from the “Onondaga Hollow NY Onondaga Register 1814-1819”. Note that there are two entries that look like the same story, but the text is garbled in different ways. Their sequences are 0335 and 0337. This means they are probably repeat scans of the same page. This happens a lot. Click each to confirm what I found: that there is no significant difference between the two scans.

FultonHistoryInstructions_22 - Wedding example - preview

 

Right-click on the link for “Onondaga Hollow NY Onondaga Register 1814-1819 sp – 0335.pdf” and select “Open link in new tab.”

FultonHistoryInstructions_23 - Wedding example with viewing pane- preview

 

Click on the new tab. Note the highlighting. This is what we want to remove. Click in the URL bar. Select the entire URL and copy it.

FultonHistoryInstructions_24 - Wedding example - new tab, highlighting, copy URL

 

Go to your text editor and paste the URL into it.

FultonHistoryInstructions_25 - Wedding example - Paste URL into Notepad

 

Starting at the beginning of the text file, search for “.pdf”. Find the first occurrence of “.pdf” with XML syntax directly following it.

FultonHistoryInstructions_26 - Wedding example - Find beginning of XML

 

Select all the text after “.pdf”.

FultonHistoryInstructions_27 - Wedding example - Select XML

 

Delete all the text after “.pdf”. Copy the remaining “bare” URL.

FultonHistoryInstructions_28 - Wedding example - Delete XML

 

Go back to your browser. Select all the text in the URL bar and delete it. Paste the “bare” URL into the URL bar and press Enter.

FultonHistoryInstructions_29 - Wedding example - Paste clean URL back into tab

 

You are now viewing the page with no highlighting. Click the Download button.

FultonHistoryInstructions_30 - Wedding example - Highlingting is gone, so download

 

Note that the suggested file name now matches the file name in the link. I suggest adding the date and some reminder of the content to the file name.
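A naming system is easier to stick to if a small script builds the name. This is just one possible convention, not a rule: the function name, the date format, and the example values are all invented.

```python
import re


def descriptive_name(original_name, date, summary):
    """Prefix the original file name with a date and a short summary."""
    # Replace characters that Windows does not allow in file names.
    safe_summary = re.sub(r'[\\/:*?"<>|]', "-", summary).strip()
    return f"{date} {safe_summary} - {original_name}"


# Invented example values.
new_name = descriptive_name(
    "Some Paper NY 1850-1855 - 0001.pdf", "1851-06-14", "Smith/Jones wedding"
)
print(new_name)
```

Keeping the original file name at the end means you can always find your way back to the page on the site.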

FultonHistoryInstructions_31 - Wedding example - Save clean pdf

 

Open the file in a PDF viewer such as Adobe Acrobat Reader. Zoom in to the article you want. Press the “Print Screen” key to get a screenshot.

FultonHistoryInstructions_32 - Wedding example - View downloaded pdf

 

Open a new Paint file and paste the screenshot image into it.

FultonHistoryInstructions_33 - Wedding example - Paste into paint

 

Crop the image as desired.

FultonHistoryInstructions_34 - Wedding example - Drag and crop image

 

Click File->Save as.

FultonHistoryInstructions_35 - Wedding example - File - Save As

 

Save the file. Again, I suggest adopting a naming system that includes the date and summary text.

FultonHistoryInstructions_36 - Wedding example - Save as png, appropriate name

You now have an image of the article, and you know the approximate date based on the file name. If this is enough for you, skip the remaining sections.

 

Find the Front Page

This example illustrates a common problem. The page has no header information. You don’t know the publishing date, and since the year in the file name is often wrong, you don’t even know that for sure. You don’t know the page number, and you have no confirmation of the exact name of the newspaper.

FultonHistoryInstructions_37 - No date, no page, no newspaper name

 

Because of all that uncertainty, and because I like having the masthead in the image, I take the time to find the front page. It may seem like a lot of work, but after you’ve done it a few times, it only takes a few seconds.

Click in the URL bar. Again, this is the bare URL that you got from stripping away the XML syntax, so it ends with “.pdf”. The number directly before that should be the page sequence. Put the cursor between the page sequence and the “.”.

FultonHistoryInstructions_38 - Go to last numeric digit in URL

 

Subtract one from the page sequence and replace it with the new number. In this case we go from 0335 to 0334, so just replace the 5 with a 4. Press Enter.

FultonHistoryInstructions_39 - Decrement numeric digit in URL

 

Now we’re getting somewhere. We still don’t know the page number, but now we have a typical page-two masthead. Given the year, it’s unlikely that this paper has more than four pages. The last page is usually all advertisements, and page three is usually local interest, so this is probably page two. If you’ve gotten all the information you need, skip ahead. But if you want the front page masthead, edit the URL again.

FultonHistoryInstructions_40 - This page shows the name and date, but it's not the front

 

Change 0334 to 0333 and press Enter.
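The same edit can be scripted if you want to step through many pages. The sketch below assumes the bare URL ends with the page sequence followed by “.pdf”, as in this example. It deliberately ignores digits that belong to an escape such as “%20”, and it does not handle the parenthesized numbering mentioned earlier. The URL is invented.

```python
import re


def step_page(url, delta):
    """Shift the page sequence just before '.pdf' by delta."""
    # The lookbehinds keep the '20' of an escaped space (%20)
    # from being read as part of the page sequence.
    match = re.search(r"(?<!%)(?<!%[0-9A-Fa-f])(\d+)\.pdf$", url)
    if match is None:
        raise ValueError("no page sequence found before .pdf")
    digits = match.group(1)
    new_value = int(digits) + delta
    if new_value < 0:
        raise ValueError("page sequence would go below zero")
    # Keep the zero padding: 0335 - 1 must become 0334, not 334.
    return url[:match.start(1)] + str(new_value).zfill(len(digits)) + ".pdf"


# Invented URL, for illustration only.
page = "https://example.com/Some%20Paper%201814-1819%20-%200335.pdf"
previous = step_page(page, -1)
front = step_page(previous, -1)
print(previous)
print(front)
```

Remember that some batches were scanned back to front, so you may need a delta of +1 to move toward the front page.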

FultonHistoryInstructions_41 - Decrement number again

 

…and here we are at the front page!

FultonHistoryInstructions_42 - Now we have front page

 

Make Your Composite Image With the Newspaper Masthead

Use the zoom controls to adjust the size of the image on the screen. I like to shrink down the masthead so that it’s just big enough to be readable. Sometimes, when the date is in a small font, that means making it much wider than the column. But in this case, it’s still readable after being shrunk down significantly.

FultonHistoryInstructions_43 - use zoom controls, get screenshot

 

Press “Print Screen” and paste the screenshot image into a new Paint file.

FultonHistoryInstructions_44 - Paste screenshot into Paint

 

Crop the image to your satisfaction.

FultonHistoryInstructions_45 - Position and crop

 

Click File->Image Properties.

FultonHistoryInstructions_46 - File Properties

 

Change the image height to something large, like 5,000 pixels. I do this because when I paste the article image in, I want room to drag it down below the masthead image without it getting chopped off at the bottom.

FultonHistoryInstructions_47 - Image Properties dialog

 

FultonHistoryInstructions_48 - Image Properties - change height

 

Go back to the Paint file with the image of the article. Copy the image. Come back to the Paint file with the masthead image. Paste the image of the article and arrange it below the masthead.

FultonHistoryInstructions_49 - Paste article and position below masthead image

 

Center the masthead relative to the article, or vice-versa, depending on which is wider. Then adjust the image height again so that it’s trimmed to the bottom of the article.
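The centering and trimming come down to simple arithmetic, which may help if you ever build the composite in a different image editor. In the sketch below, the function name and the pixel sizes are invented, and the optional gap between the two images is my own addition.

```python
def composite_layout(masthead_size, article_size, gap=0):
    """Return the canvas size and the top-left corner of each image."""
    m_width, m_height = masthead_size
    a_width, a_height = article_size
    # The canvas is as wide as the wider image and exactly tall
    # enough for the masthead, the gap, and the article.
    canvas_width = max(m_width, a_width)
    canvas_height = m_height + gap + a_height
    # Center each image horizontally; the wider one ends up at x = 0.
    masthead_pos = ((canvas_width - m_width) // 2, 0)
    article_pos = ((canvas_width - a_width) // 2, m_height + gap)
    return (canvas_width, canvas_height), masthead_pos, article_pos


# Invented sizes in pixels: a wide masthead above a narrow article.
canvas, masthead_pos, article_pos = composite_layout(
    (900, 160), (500, 1200), gap=20
)
print(canvas, masthead_pos, article_pos)
```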

FultonHistoryInstructions_50 - Reposition masthead if necessary, adjust height

Congratulations! You have an image of your article with the front page masthead. No one who sees this image will ever have to wonder where it came from. Again, this may seem like a lot of work, but once you’ve done it a few times, the whole process takes a few minutes.