Converting HTML - Empty page / Authentication problems

As of version 8.3, the PDF Converter returns a detailed message in the generated PDF in case of authentication problems or other connectivity-related errors. Please consider upgrading to that version if you are experiencing issues when converting HTML to PDF.

 

If an HTML conversion produces a blank PDF, or a PDF containing an authentication-related error, the most likely cause is that the account the Muhimbi Conversion Service runs under has no access to the page you are trying to convert. The easiest way to troubleshoot this is as follows:

  1. Launch our diagnostics tool from the Windows Start Menu.
  2. Navigate to the ‘HTML Conversion’ tab and click both ‘Convert’ buttons using the default settings. Each should return valid content.
  3. If both indeed return valid content then please change the URL to the URL of the content you are trying to convert. Please do not enter a username and password.
  4. If step #3 returns an empty PDF then look in your web server's (IIS) log file to see which HTTP status code was returned for the request (for example, 401 indicates that authentication failed).
 

When troubleshooting HTML Conversions, take the following into account:

  1. Windows Internet settings (a.k.a. Internet Explorer settings), as configured for the account the Conversion Service is running under, are used during conversion. This includes security zones and other security-related settings such as block lists, authentication and proxy settings.
     
  2. The actual conversion is carried out by the Conversion Service on the machine that is running this service. If the URL to convert is located on a different machine, or the IP address of the page to convert will cause the request to be routed 'off machine' then the request may go via various systems in your infrastructure, including (transparent) proxy servers, firewalls, authentication systems etc.
 
A common problem is that, even though the Conversion Service account has the appropriate privileges on the destination URL, Windows' Internet Settings have not been configured to log in automatically. This can be solved as follows:
  1. Log in to the desktop of the server that runs the Conversion Service using the account the Conversion Service runs under.
  2. Launch Internet Explorer and navigate to Internet Options / Security.
  3. Select the zone for the domain of the URL that you are trying to convert. If the domain is not part of any zones then add it to the list of 'Intranet' or 'Trusted sites'.
  4. With the relevant zone selected click 'Custom Level' and scroll all the way to the bottom.
  5. Set 'Logon' to either 'Automatic logon with current user name and password' (to automatically authenticate using the account the Conversion Service runs under), or 'Prompt for user name and password' (to log in using the credentials provided in the Diagnostics Tool or Workflow Actions).
 
Determining the active Zone for a URL can be surprisingly complicated. Microsoft's IEZoneAnalyzer is a particularly useful utility for troubleshooting Zone settings. 


As of version 7.0, the credentials used for converting HTML to PDF are no longer limited to the Conversion Service account. They can be overridden in the Conversion Service's config file using the settings below. Please note that Internet Explorer needs to be set to "User Authentication / Prompt for Username & Password" for the account the Conversion Service runs under, and for the Security Zone the web page is located in.
  
<add key="HTMLConverterFullFidelity.URLUsername" value=""/>
<add key="HTMLConverterFullFidelity.URLPassword" value=""/>
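
As an illustration, the two settings might be filled in as in the sketch below. The account name (including its DOMAIN\user format) and the password are hypothetical placeholders and should be replaced with your own values. Because the config file is XML, reserved characters in a value need to be escaped, which the example password demonstrates.

```xml
<!-- Hypothetical credentials, for illustration only; substitute your own. -->
<add key="HTMLConverterFullFidelity.URLUsername" value="CONTOSO\svc-htmlconvert"/>
<!-- The '&' in the example password 'P&ssw0rd' is written as '&amp;' because this is XML. -->
<add key="HTMLConverterFullFidelity.URLPassword" value="P&amp;ssw0rd"/>
```

As noted above, these credentials only come into play when the relevant Security Zone is set to prompt for a user name and password.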

 

Starting with version 8.3, the following configuration settings are useful when it comes to troubleshooting authentication issues or dealing with empty or partially displayed pages. Details about editing the configuration file can be found here. Please note that many of these settings can also be specified on a request by request basis using either our Workflow Actions or Web Services Interface.
 

  • <add key="HTMLConverterFullFidelity.HtmlRenderingEngine" value="WebKit"/>
    Specify which HTML conversion engine to use. WebKit generally produces better output, especially with SharePoint 2013 and later, but for some rare scenarios the IE (Internet Explorer) option may work better. 
     
  • <add key="HTMLConverterFullFidelity.ConversionDelay" value="0"/>
    Modern web pages, including SharePoint 2013 and later, as well as SharePoint Online, use complex methods for rendering pages, often involving JavaScript. Under normal circumstances the Muhimbi Converter will convert HTML to PDF the moment the page has finished loading. However, you may want to add some milliseconds (a value of 1000 is 1 second) to allow all JavaScript to finish executing.
     
  • <add key="HTMLConverterFullFidelity.MediaType" value="Print"/>
    HTML is a language that is primarily aimed at displaying information on screen, and not at print purposes. However, modern web pages - including SharePoint - use CSS to define a different look and feel for the screen and the printer. Provided the default 'WebKit' based HTML Converter is enabled, our system defaults to Print CSS, which in some cases may not generate the desired output. Specify the 'Screen' option to generate a PDF that resembles what is displayed on screen. (The IE based converter always uses the 'Screen' option).
     
  • <add key="HTMLConverterFullFidelity.AuthenticationMode" value="WebAuthentication"/>
    Out of the box, the Muhimbi PDF Converter attempts to authenticate using standard Web / HTTP / Windows authentication. However, in order to convert SharePoint Online pages, a different authentication type will need to be specified. Use the 'MSOAuthentication' option when converting SharePoint Online URLs.
     
  • <add key="HTMLConverterFullFidelity.WebKitProxyURL" value=""/>
    Both the WebKit and IE based HTML Converters honour the system Proxy settings for the account the Conversion Service runs under. However, the WebKit based converter also allows these settings to be specified in the config file. If the proxy requires authentication, please make sure the optional 'WebKitProxyUsername' and 'WebKitProxyPassword' are filled out as well.

  • <add key="HTMLConverterFullFidelity.ErrorReporting" value="Detail"/>
    In case of authentication or connectivity related problems when converting a URL to PDF, the generated PDF will contain a descriptive error message by default (e.g. 'Access Denied'). However, rather than using the default 'Detail' option, it is also possible to return either a blank PDF (use 'Blank'), or fail the entire conversion (use 'Exception').
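
To show how these settings fit together, the following sketch combines them for a hypothetical scenario: converting a JavaScript-heavy SharePoint Online page so that the PDF resembles the on-screen view. All keys and option values are taken from the descriptions above. The 2 second delay is an arbitrary starting point chosen for illustration, not a recommended value.

```xml
<!-- Use the WebKit engine. The IE based converter always uses 'Screen', so MediaType only matters with WebKit. -->
<add key="HTMLConverterFullFidelity.HtmlRenderingEngine" value="WebKit"/>
<!-- Wait 2000 ms (2 seconds) after the page has loaded so JavaScript can finish; tune as needed. -->
<add key="HTMLConverterFullFidelity.ConversionDelay" value="2000"/>
<!-- Render using the screen styles instead of the default Print CSS. -->
<add key="HTMLConverterFullFidelity.MediaType" value="Screen"/>
<!-- SharePoint Online URLs need MSOAuthentication rather than the default WebAuthentication. -->
<add key="HTMLConverterFullFidelity.AuthenticationMode" value="MSOAuthentication"/>
<!-- Fail the conversion on authentication or connectivity errors instead of embedding the message in the PDF. -->
<add key="HTMLConverterFullFidelity.ErrorReporting" value="Exception"/>
```

While troubleshooting, it may be more convenient to leave ErrorReporting at 'Detail' so that the error message is visible in the generated PDF, and switch to 'Exception' once the conversion works.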

 

Further details, specific to solving HTML to PDF formatting issues, can be found in this Knowledge Base article.

 

 As always, if you require assistance, please contact our friendly support desk. We are here to help.

 

 
