Which part of my XSLT makes this process take hours?

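The XSLT below takes hours to run. Most of the XSLT I run takes seconds or minutes. What am I doing wrong? The goal is to take Word XHTML and convert it into a flat file for import into a dictionary program called FLEx. This stylesheet is just one step, which identifies the pieces of each dictionary entry. I have a 52K input XHTML file, and I do the conversion in 27 steps. The first ones are done with Saxon and XSLT. The final steps are done with a special program called CC, which predates AWK and Perl. It is a very efficient string-replacement tool, and processing the file in CC takes seconds. The first 8 steps are XSLT, and each of them takes forever (more than 3 hours) to run. The last XSLT step flattens the file so that it is no longer XML; CC works on text files.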

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE stylesheet [
<!ENTITY cr "&#xD;&#xA;">
<!ENTITY tab "&#9;">
<!ENTITY nbsp "&#xa0;">
]><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:html="http://www.w3.org/1999/xhtml"
    version="3.0">
    
    
    <xsl:strip-space elements="*"/>
    <xsl:output  encoding="UTF-8" indent="yes"/>

    <xsl:template match="html:span[@class='source']" priority='4'>
       <xsl:element name="span">
           <xsl:attribute name="class">source</xsl:attribute>
        <xsl:value-of select="normalize-space(.)"/>
       </xsl:element>
    </xsl:template>
    
    <xsl:template match="html">
        <html>
            <xsl:apply-templates/>
        </html>
    </xsl:template>
    
    <xsl:template match="html:i" priority="1">
        <xsl:element name="span">
            <xsl:attribute name="class">2.3 italic</xsl:attribute>
            <xsl:apply-templates/>
        </xsl:element>
    </xsl:template>
    
    <xsl:template match="html:b">
        <xsl:choose>
            <xsl:when test="contains(.,'Derivation')">
         <xsl:element name="span">
             <xsl:attribute name="class">2.3dd derivation</xsl:attribute>
         </xsl:element>
            </xsl:when>           
            <xsl:when test="ancestor::*[contains(@class,'1.4 lx')]">
                <xsl:apply-templates/>
            </xsl:when>           
            <xsl:when test="ends-with(.,'-')">
                <xsl:element name="span">
                    <xsl:attribute name="class">2.3a variant-none</xsl:attribute>
                    <xsl:value-of select="."/>
                </xsl:element></xsl:when>
            <xsl:when test="preceding::*[1]=preceding::html:br[1] and not(contains(.,'Forms')) and not(starts-with(following::text()[1],'(')[1])">
                <xsl:element name="span">
                    <xsl:attribute name="class">2.3 variant-space</xsl:attribute>
                    <xsl:apply-templates/> 
                    <!--          <xsl:value-of select="."/> joins that we don't want-->     
                </xsl:element></xsl:when>
            <xsl:when test="preceding::*[1]=preceding::html:br[1] and not(contains(.,'Forms'))" >
                <xsl:element name="span">
                    <xsl:attribute name="class">2.3 variant-none</xsl:attribute>
                    <xsl:value-of select="."/>     
                </xsl:element></xsl:when>
            <xsl:otherwise>
                <xsl:apply-templates/>
           </xsl:otherwise>         
        </xsl:choose>
    </xsl:template>
    
    <xsl:template match="html:br">
        <xsl:choose>
            <xsl:when test="starts-with(following::text()[1],' (')">
                <xsl:call-template name="makeLineBreak"/>
                <xsl:element name="span">
                    <xsl:attribute name="class">2.3 text-(</xsl:attribute>
                </xsl:element>
          <xsl:apply-templates/>      
          </xsl:when>
            <xsl:when test="following::*[1]=following::html:span[@class='MsoHyperlink'][following::*[1]=following::html:b[1]]">
                <xsl:call-template name="makeLineBreak"/>               
                <xsl:copy-of select="."/>
            </xsl:when>
            <xsl:when test="following::*[1]=following::html:span[@class='MsoHyperlink']">
                <xsl:call-template name="makeLineBreak"/>               
                <xsl:copy-of select="."/>
            </xsl:when>
            <xsl:when test="following::*[1]=following::html:span[@class='Arial'][1]">
                <!--      2.3 -->  
            </xsl:when>
            <xsl:when test="starts-with(following::text()[1],'variant of')">
                <!--      2.3 variant of-->  
            </xsl:when>
            <xsl:when test="following::*[1]=following::html[b][1] and contains(following::html:b[1],'Forms')">
                <xsl:call-template name="makeLineBreak"/>
                <xsl:copy-of select="."/>
            </xsl:when>
            <xsl:when test="preceding::text()[1]=')'">
                <!--      2.3 -->  
                <xsl:call-template name="makeLineBreak"/>               
                <xsl:element name="span">
                    <xsl:attribute name="class">2.3b definition</xsl:attribute>
                    <xsl:call-template name="processBold"/> 
                </xsl:element>
            </xsl:when>
            <xsl:when test="starts-with(following::text()[1],'(')">
                <!--      2.3 -->  
                <xsl:call-template name="makeLineBreak"/>               
                <xsl:element name="span">
                    <xsl:attribute name="class">2.3 gid</xsl:attribute>
                    <xsl:call-template name="processBold"/> 
                </xsl:element>
            </xsl:when>
            <xsl:when test="starts-with(following::text()[1],'(')">
                <!--      2.3 -->  
                <xsl:call-template name="makeLineBreak"/>               
                <xsl:element name="span">
                    <xsl:attribute name="class">2.3 gid</xsl:attribute>
                    <xsl:call-template name="processBold"/> 
                </xsl:element>
            </xsl:when>
            <xsl:otherwise>
                <!--      2.3 -->  
                <xsl:call-template name="makeLineBreak"/>               
                <xsl:element name="span">
                    <xsl:attribute name="class">2.3 definition</xsl:attribute>
                    <xsl:call-template name="processBold"/> 
                </xsl:element>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    
    <xsl:template name="processBold">
            <xsl:choose> 
                <xsl:when test="self::html:b and preceding::*[1]=preceding::span[1][@class='vernacular']">
                    2.3 vernacular bold 
                    <xsl:text> </xsl:text> 
                </xsl:when> 
                <xsl:when test="self::html:b">
                    2.3 bold 
                    <xsl:apply-templates select="."/> 
                    <xsl:text> </xsl:text> 
                </xsl:when> 
                <xsl:when test="self::html:span[@class='Arial']"> 
                    <xsl:element name="span">
                        <xsl:attribute name="class">2.3a definition</xsl:attribute>
                        <xsl:value-of select="."/>
                    </xsl:element>          
                </xsl:when> 
                
                <xsl:otherwise> 
                </xsl:otherwise> 
            </xsl:choose>
        
    </xsl:template>
    
    <xsl:template match="html:span[@lang][parent::html:b]" priority="1">
      <xsl:choose>
          <xsl:when test="preceding::*[1]=preceding::html:br[1]">
              <xsl:element name="span">
                  <xsl:attribute name="class">2.3va variant-none</xsl:attribute>
                  <xsl:apply-templates/>
              </xsl:element>
          </xsl:when>
          <xsl:when test=".='/'">
              <xsl:value-of select="."/>
          </xsl:when>
          <xsl:when test="contains(.,'-')">
              <xsl:text>&nbsp;</xsl:text>
              <xsl:apply-templates/>
          </xsl:when>
    <xsl:otherwise>
  <xsl:apply-templates/>
    </xsl:otherwise>      
      </xsl:choose>
    </xsl:template>
    
    <xsl:template name="makeLineBreak">
        <xsl:text>
</xsl:text>      
    </xsl:template>
    
    <!-- identity transform -->
    <xsl:template match="@*|*|processing-instruction()|comment()">
        <xsl:copy>
            <xsl:apply-templates select="*|@*|text()|processing-instruction()|comment()"/>
        </xsl:copy>
    </xsl:template>    
</xsl:stylesheet>

Answer:

You have a lot of expressions like

test="following::*[1]=following::html:span[@class='MsoHyperlink'][following::*[1]=following::html:b[1]]">

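In principle, the time taken to evaluate the following (or preceding) axis is proportional to the size of the document. However, if the step is followed by the predicate [1], the search stops at the first following node it encounters, so it runs in constant time (that is, time independent of document size). Three of the four uses of following in this expression fall under that rule; the fourth (following::html:span[@class='MsoHyperlink']) does not. So this particular test takes time proportional to the size of the document. You evaluate the test once for every br element, so the number of evaluations is presumably also proportional to document size; that makes the total cost O(n^2).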
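As a rough sketch of what bounding every following step could look like, the stylesheet below rewrites the MsoHyperlink test so that the only axis step is following::*[1], and the check on the element type is moved into predicates on that single node. It assumes the intent of the original test was "the first element after this br is a span with class MsoHyperlink"; if the original really meant to compare text content, this is not an equivalent rewrite. The variable name next and the use of xsl:mode for the fallback copy are choices made for this illustration, not part of the original stylesheet.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml"
    version="3.0">

    <!-- Copy every node that no template below matches (XSLT 3.0). -->
    <xsl:mode on-no-match="shallow-copy"/>

    <xsl:template match="html:br">
        <!-- The [1] predicate lets the processor stop at the first element found. -->
        <xsl:variable name="next" select="following::*[1]"/>
        <xsl:choose>
            <!-- Assumed intent: the very next element is an MsoHyperlink span. -->
            <xsl:when test="$next[self::html:span][@class='MsoHyperlink']">
                <xsl:text>&#10;</xsl:text>
                <xsl:copy-of select="."/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:copy-of select="."/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

Because the test no longer scans all following spans, its cost per br should not grow with the size of the document.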

Very often, people use preceding and following where preceding-sibling and following-sibling would be more appropriate. I don't know whether that is the case here.
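If the br and b elements in the Word XHTML are in fact siblings inside the same paragraph (an assumption that would need checking against the real input), a sibling-axis version of one of the tests might look like the sketch below. Note that preceding-sibling::*[1] and preceding::*[1] are not interchangeable in general: when the previous sibling has element children, preceding::*[1] selects its last descendant element rather than the sibling itself.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml"
    version="3.0">

    <!-- Copy every node that no template below matches (XSLT 3.0). -->
    <xsl:mode on-no-match="shallow-copy"/>

    <!-- A b whose immediately preceding sibling element is a br.
         Only the siblings of the b are inspected, not the whole document. -->
    <xsl:template match="html:b[preceding-sibling::*[1][self::html:br]]">
        <span xmlns="http://www.w3.org/1999/xhtml" class="2.3 variant-none">
            <xsl:value-of select="."/>
        </span>
    </xsl:template>
</xsl:stylesheet>
```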

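I suspect that in most of these expressions you are using "=" where you should be using "is". An "=" test on elements with large subtrees is very expensive (at least proportional to the size of the trees being compared).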
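To illustrate the difference, here is a minimal sketch of one of the html:b tests written with the node-identity operator. With "=", both operands are atomized and their string values are compared, so two different elements with the same text (for example, any two empty elements) compare equal; with "is", the test is true only when both sides are the same node. The sketch assumes that identity is what the original test intended, and it reduces the template to a single branch for brevity.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml"
    version="3.0">

    <!-- Copy every node that no template below matches (XSLT 3.0). -->
    <xsl:mode on-no-match="shallow-copy"/>

    <xsl:template match="html:b">
        <xsl:choose>
            <!-- True only if the nearest preceding element is itself a br.
                 If either side is empty, the result is empty, which tests as false. -->
            <xsl:when test="preceding::*[1] is preceding::html:br[1]">
                <span xmlns="http://www.w3.org/1999/xhtml" class="2.3 variant-none">
                    <xsl:value-of select="."/>
                </span>
            </xsl:when>
            <xsl:otherwise>
                <xsl:apply-templates/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

Both operands carry a [1] predicate, so each selects at most one node, which is what "is" requires.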

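You can start by going through the code looking for obvious inefficiencies like these, or you can start with performance measurement and analysis of the results. When faced with a large body of code, especially unfamiliar code, the second approach is usually more productive. Start by getting the -TP:profile.html output to see whether it identifies obvious hot spots. Also, of course, get timings for each of the 27 steps and decide which of them to focus on.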


Tags: xml, performance, xslt
