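I'm running a Node.js server on a Google App Engine (GAE) flex instance, and my clients are getting intermittent 502 errors from my app. These requests never reach my Node service, but they appear to line up with "connection reset by peer" entries in the nginx log: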
[error] 34#34: *25817 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *27919 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *28746 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *28747 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *24022 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
[error] 34#34: *29214 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: , request: "POST /endpoint HTTP/1.1", upstream: "http://172.17.0.1:8080/endpoint"
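What could cause this behavior? CPU and memory load are nowhere near the resource limits, although it does seem to happen more often when the server is under some load.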
Answer from a uj5u.com user:
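When you deploy to Google App Engine, a load balancer is placed in front of your instance. That load balancer has an HTTP keep-alive timeout of 600 seconds.
The load balancer then connects to the nginx service on the instance, which uses a keep-alive timeout of 650 seconds. The nginx config even carries a helpful comment saying the value needs to be higher to prevent a race condition: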
# GCLB uses a 10 minutes keep-alive timeout. Setting it to a bit more here
# to avoid a race condition between the two timeouts.
keepalive_timeout 650;
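Finally, nginx reverse-proxies to your Node app, which uses the default keep-alive timeout of... 5 seconds.
This causes a race condition between the timeouts (duh): Node can close an idle connection at the same moment nginx reuses it for a new request, so nginx sees a reset and returns a 502. You need to set your Node server's keep-alive timeout higher than 650 seconds. If you are using Express, it looks like this: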
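const express = require('express');

const app = express();
const server = app.listen(process.env.PORT);
// nginx on GAE uses a 650-second keep-alive timeout. Set it a bit higher here to avoid a race condition between the two timeouts.
server.keepAliveTimeout = 700000;
// Make sure headersTimeout is set higher than keepAliveTimeout because of this Node.js regression bug: https://github.com/nodejs/node/issues/27363
server.headersTimeout = 701000;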
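If you are not using Express, the same two settings should apply to a plain Node `http` server as well. Here is a minimal sketch, assuming the same timeout values as the Express snippet; `createTunedServer` and the handler passed to it are hypothetical names made up for this illustration, not part of any library:

```javascript
const http = require('http');

// Hypothetical helper: wraps http.createServer and applies the two timeouts.
// The values (in milliseconds) are assumptions copied from the Express example.
function createTunedServer(handler) {
  const server = http.createServer(handler);
  // Hold idle connections longer than nginx's 650-second keepalive_timeout.
  server.keepAliveTimeout = 700 * 1000;
  // Keep headersTimeout above keepAliveTimeout (see nodejs/node issue 27363).
  server.headersTimeout = 701 * 1000;
  return server;
}
```

To use it, you would call something like `createTunedServer((req, res) => res.end('ok')).listen(process.env.PORT)`. The sketch deliberately does not start listening on its own, so it can be dropped into an existing entry point.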
See "Analyze 'Connection reset' error in Nginx upstream with keep-alive enabled" for a technical (TCP-level) explanation of why the upstream server needs the larger timeout.
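The rule running through this answer is that each hop should hold idle connections longer than the hop in front of it. As a rough illustration only, the check can be written down in a few lines; `findRacyHops` and the `hops` list are hypothetical, and the numbers are just the ones quoted in this answer (600 s for the load balancer, 650 s for nginx, 5 s for Node's default):

```javascript
// Idle keep-alive timeouts in seconds, ordered from the client side inward.
// Values are the ones quoted in this answer, not read from any real config.
const hops = [
  { name: 'GCLB', keepAliveSeconds: 600 },
  { name: 'nginx', keepAliveSeconds: 650 },
  { name: 'node (default)', keepAliveSeconds: 5 },
];

// Hypothetical helper: lists every upstream hop whose idle timeout is not
// strictly longer than the hop directly in front of it.
function findRacyHops(chain) {
  const problems = [];
  for (let i = 1; i < chain.length; i++) {
    const front = chain[i - 1];
    const upstream = chain[i];
    if (upstream.keepAliveSeconds <= front.keepAliveSeconds) {
      problems.push(
        `${upstream.name} (${upstream.keepAliveSeconds}s) <= ${front.name} (${front.keepAliveSeconds}s)`
      );
    }
  }
  return problems;
}

console.log(findRacyHops(hops));
```

With the default Node timeout, the only pair flagged is node behind nginx; raising the Node value to 700 seconds makes the list come back empty.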
