Sharepoint 2010 : Setting Up the Crawler - Crawling Other Document Types with iFilters

3/4/2012 6:07:37 PM
Not all file types are crawled by SharePoint out of the box. Therefore, it is important to identify the file types that are important to the organization and make sure they are both crawled and searchable. It is probably not possible or desirable to crawl all file types found in an organization (especially those lingering on file shares); however, some thought should be given to which file types hold content relevant to the business's needs.
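
One way to start identifying candidate file types is to inventory the extensions on a file share and compare them with the extensions already listed for the crawler. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the list of crawled extensions is something you supply yourself (for example, copied from the File Types page), not something the script reads from SharePoint.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root):
    """Count files under `root` by lower-cased extension ('' for no extension)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts


def report_uncrawled(counts, crawled):
    """Return (extension, count) pairs missing from `crawled`, most frequent first.

    `crawled` is a list you provide, e.g. [".docx", ".txt"]; it is an assumption
    of this sketch, not data retrieved from the Search service application.
    """
    known = {ext.lower() for ext in crawled}
    missing = [(ext, n) for ext, n in counts.items() if ext and ext not in known]
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(missing, key=lambda pair: (-pair[1], pair[0]))
```

Calling report_uncrawled(count_extensions(some_folder), [".docx", ".txt"]) gives a ranked list of extensions to review; whether each one is worth crawling remains a business decision.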

1. Adding a File Type to the Content Index

The first step after identifying a potentially unsupported file type is to add it to the content index. This is done in Central Administration, on the Search service application page. The File Types menu item appears in the left menu under the Crawling section (see Figure 1).

Figure 1. The Crawling menu on the Search Service Application page

The File Types page holds a list of all file types recognized by the SharePoint crawler (Figure 2). The most common files found in a SharePoint environment and all Microsoft Office file types are represented here. However, some file types common to most organizations, such as Portable Document Format (PDF) and Rich Text Format (RTF), are not added out of the box. Many other file types may also be found in organizations. Some are unique and complicated file formats; others are just different names for plain text files. It would be a major undertaking for Microsoft to support even a fraction of them. Instead, Microsoft has created a mechanism for adding new file types and converting them into something SharePoint's crawler can recognize.

To add a new file type, click the New File Type link at the top of the page. A new file type may already be a supported format, but the file extension might not be recognized by SharePoint. For example, there can be many variations of file name extensions for flat text files (e.g., .log). See Figure 3. Additionally, some file types will not appear by default but can be recognized and decoded by the default iFilters. If it is necessary to crawl these files, adding them is a simple but required task.

Figure 2. The File Types page

Figure 3. Adding the Log File Format (.log) file type for SharePoint to crawl

Some files will require the addition of an iFilter. An iFilter is a component that can decode a specific file type and allow the crawler to understand and store the text and metadata from it in its databases and index. Although many iFilters are provided for free from Microsoft and other sources, not all are installed by default by SharePoint 2010. Finding and installing these iFilters can be necessary to index certain file types. One of the most common file types found that is not supported by default in SharePoint is the Rich Text Format (RTF); another is the Portable Document Format (PDF).

To add the PDF format, it is recommended that you acquire an installable iFilter from Adobe or another third-party vendor. Other third-party vendors offer iFilters that have a larger range of compatibility with different PDF generation types and perform significantly better than Adobe's but come with a relatively modest price tag. Depending on what type of PDF file generator an organization uses and how many PDF documents it has, it may opt to use a third-party paid-for iFilter. 

1.1. Installing Adobe's iFilter

The Acrobat iFilter can be acquired from Adobe's web site. It will be necessary to download and install the 64-bit version, available from www.adobe.com/support/downloads/detail.jsp?ftpID=4025. The installation requires some additional steps, which are outlined in the PDF guide linked from the download page on Adobe's site (www.adobe.com/special/acrobat/configuring_pdf_ifilter_for_ms_sharepoint_2007.pdf). The guide is targeted at Microsoft Office SharePoint Server 2007, but the same instructions apply except for the location of the registry key. The basics of installing the 64-bit iFilter are as follows:

  1. Download the iFilter.

  2. Run the installer.

  3. Open the registry (Regedit.exe) and add a .pdf key to the filter extension container (Figure 4) at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Setup\ContentIndexCommon\Filters\Extension. The default value of the new key should be set to {E8978DA6-047F-4E3D-9C78-CDBE46041603}.

  4. Add an icon for the new file type (see "Adding or Changing the File Type Icon").

  5. Restart the Search service application by running Services.msc at the run dialog, finding the SharePoint Server Search 14 service, and restarting it.

  6. Perform a full crawl.

    Figure 4. The new .pdf registry key in the registry
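
The registry change in step 3 can also be prepared as a .reg file and reviewed before it is imported. The sketch below only builds the file text; it does not touch the registry. The key path and GUID are the ones given in step 3, while the function and constant names are hypothetical and chosen for illustration.

```python
# Key path and filter GUID as quoted in step 3 above.
FILTER_EXTENSION_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0"
    r"\Search\Setup\ContentIndexCommon\Filters\Extension"
)
ADOBE_PDF_FILTER_GUID = "{E8978DA6-047F-4E3D-9C78-CDBE46041603}"


def build_reg_file(extension, filter_guid):
    """Return .reg file text that creates a key for `extension` under the
    filter extension container and sets its default value to `filter_guid`."""
    if not extension.startswith("."):
        extension = "." + extension
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[{}\\{}]".format(FILTER_EXTENSION_KEY, extension),
        '@="{}"'.format(filter_guid),  # "@" denotes the key's default value
        "",
    ]
    return "\r\n".join(lines)
```

Printing build_reg_file("pdf", ADOBE_PDF_FILTER_GUID) shows exactly what would be imported. Compare the result with Figure 4 and the vendor's instructions before applying it on the index server.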

Other iFilter vendors' installation programs perform these tasks automatically. However, it is always necessary to perform a full crawl to retrieve new file types. Please follow the iFilter vendor's instructions when adding a new iFilter.

Before PDF documents or any new file type can be crawled, the Search service application will need to be restarted. The easiest way to do this is to go to the Services snap-in by typing "services.msc" in the search box in the Start menu on the server, find the SharePoint Server Search 14 service, and restart it (see Figure 5). After the service is restarted, it will be necessary to launch a full crawl to pick up any PDF files. For this reason, it is wise to install the PDF iFilter before starting the crawler on a large document set for the first time.

Figure 5. Restarting the Search service in the Services snap-in

NOTE

The Search service application can also be restarted using the NET START and NET STOP commands from a command prompt. The name of the Search service application is OSearch14.
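
For administrators who script the restart, the two commands can be wrapped as follows. This is a minimal Python sketch, assuming it runs in an elevated session on the SharePoint server; the function names are hypothetical, and by default it only prints the commands instead of running them.

```python
import subprocess

SEARCH_SERVICE = "OSearch14"  # service name given in the note above


def restart_commands(service=SEARCH_SERVICE):
    """Return the commands that stop and then start a Windows service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service=SEARCH_SERVICE, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in restart_commands(service):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if the command fails,
            # so a failed stop is not followed by a start attempt.
            subprocess.run(command, check=True)
```

Calling restart_service() shows the commands; restart_service(dry_run=False) runs them. A full crawl is still required afterward.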

1.2. Indexing Rich Text Format Files

Indexing Rich Text Format (RTF) files requires finding the RTF iFilter on the server or on Microsoft's web site, then installing and registering it on the SharePoint index server. The RTF file type must also be added on the File Types page.

  1. Check if the RTF iFilter is on the server. It is called rtffilt.dll and is in the %windir%\system32 folder (probably C:\Windows\System32). If it isn't there, the self-extracting RTF iFilter file rtf.exe can be downloaded from Microsoft's web site (http://support.microsoft.com/kb/291676/en-us).

  2. Register the iFilter with the regsvr32 command by entering this line at a command prompt or in the Run dialog: regsvr32 rtffilt.dll.

  3. Add the RTF file type to the File Types page in Central Administration, as shown in Figure 6.

  4. Run Services.msc at the run dialog, find the SharePoint Server Search 14 service, and restart it.

  5. Start a full crawl.

    Figure 6. Adding the Rich Text Format (.rtf) file type
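
Steps 1 and 2 above can be sketched in Python as a check followed by a command to run. The file name and folder come from step 1; the function names are hypothetical, and the sketch only returns the regsvr32 command rather than executing it.

```python
import ntpath
import os


def rtf_filter_path(windir):
    """Build the expected location of rtffilt.dll under the Windows directory."""
    # ntpath applies Windows path rules regardless of where this script runs.
    return ntpath.join(windir, "system32", "rtffilt.dll")


def regsvr32_command(dll_path):
    """Return the command that registers the iFilter DLL."""
    return ["regsvr32", dll_path]


def plan_rtf_registration(windir, exists=os.path.exists):
    """Return the registration command, or None if the DLL is not present.

    None means the iFilter must first be downloaded from Microsoft (step 1).
    `exists` is injectable so the check can be exercised without a real server.
    """
    path = rtf_filter_path(windir)
    if not exists(path):
        return None
    return regsvr32_command(path)
```

On the server, plan_rtf_registration(os.environ["windir"]) returns the command to run from an elevated prompt, or None when rtffilt.dll is missing.
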

1.3. Adding or Changing the File Type Icon

New file types will usually not have an icon associated with them. SharePoint Search will display a default blank icon in such cases after the file type has been added. Even if there is a definition already there, many organizations will want to adjust the icons to match their own style requirements.

To add a new file type icon or change the existing one, first the new icon must be copied to the C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\IMAGES\ directory. Images can be downloaded or created. They should be icon size, roughly 16 pixels by 16 pixels. For PDF files, Adobe offers a free 17-pixel-by-17-pixel icon, with some legal restrictions, at www.adobe.com/misc/linking.html#pdficon. This will also work, but larger icons may cause formatting problems.

After the icon is added to the IMAGES directory, the DOCICON.xml file in the C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\TEMPLATE\XML directory needs to have a key added to tell SharePoint the name of the new file type's icon. For the PDF icon, the line should read <Mapping Key="pdf" Value="pdficon.gif"/>, where pdficon.gif is the name of the icon you saved to the directory.
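
The DOCICON.xml edit can also be scripted. The sketch below is a minimal Python example that works on the XML text and returns the updated text. It assumes the Mapping elements for file extensions sit inside a ByExtension element; verify this against your own copy of the file. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def add_icon_mapping(docicon_xml, extension, icon_file):
    """Add or update a <Mapping Key=... Value=.../> entry and return the new XML.

    Assumes extension mappings live under a <ByExtension> element.
    """
    root = ET.fromstring(docicon_xml)
    by_extension = root.find("ByExtension")
    if by_extension is None:
        raise ValueError("no ByExtension section found")
    key = extension.lstrip(".").lower()
    for mapping in by_extension.findall("Mapping"):
        if mapping.get("Key", "").lower() == key:
            mapping.set("Value", icon_file)  # update the existing entry
            break
    else:
        # No entry for this extension yet, so append a new one.
        ET.SubElement(by_extension, "Mapping", {"Key": key, "Value": icon_file})
    return ET.tostring(root, encoding="unicode")
```

For the PDF example, add_icon_mapping(text, "pdf", "pdficon.gif") produces the mapping described above. Back up DOCICON.xml before overwriting it, since this parser does not preserve XML comments from the original file.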

NOTE

Since a full crawl will be needed to include new file types, it is advisable to add any iFilters and file types before a full crawl in a large production environment.
