There is no way to specify an upper bound for the chunk size. #379

@sumerman

Description

When hackney consumes a chunked response, it always waits until an entire chunk is available before sending it to the client process, even when used asynchronously. I understand why this might be desirable in many cases, yet sometimes it is absurd or even a security threat.

Consider a service that produces large chunked responses behind Nginx configured as in the following snippet:

worker_processes  1;

events {
    worker_connections  1024;
}

http {
    default_type  application/octet-stream;

    sendfile        on;
    keepalive_timeout  65;
    proxy_cache_path /tmp/nginx/mycache levels=1:2 keys_zone=mycache:100m inactive=10m max_size=3g use_temp_path=off;

    server {
        listen       8080;
        server_name  localhost;

        proxy_set_header Host $http_host;
        proxy_read_timeout 900s;

        proxy_cache_lock on;
        proxy_cache_lock_timeout 300s;

        proxy_cache_valid any 10s;

        add_header X-Cached $upstream_cache_status;

        location / {
            root   html;
            index  index.html index.htm;
        }

        location /test {
            proxy_cache  mycache;
            proxy_pass   http://127.0.0.1:8090/;
        }

    }
}

For cache misses, chunks in a response might get merged, but for the hits, Nginx responds with a single mega-chunk containing the entire response (I checked that with nginx/1.10.2).

The described behaviour makes no sense for proxy-like workloads: why would anyone want to accumulate 1 GB of chunked data before proxying a single byte? To make matters worse, a malicious upstream service could exploit it to run an app that uses hackney out of memory. Finally, combined with #378, it makes for a deadly duo.

I suggest introducing an option that sets an upper bound on the chunk size. Whenever the available part of a chunk exceeds that bound, it must be sent to the calling process right away, effectively splitting the chunk into smaller pieces. I also suggest picking a reasonable default that is not infinity.
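
To make the proposal concrete, here is a minimal sketch of the splitting logic. It is written in Python purely for illustration and is not hackney's code or API; the names `BoundedChunkDecoder`, `max_piece` and `feed` are hypothetical. It assumes a well-formed chunked body and ignores trailers.

```python
class BoundedChunkDecoder:
    """Incremental decoder for an HTTP/1.1 chunked body (illustrative only).

    A piece is delivered as soon as the current chunk is complete OR
    max_piece payload bytes have accumulated, whichever comes first.
    """

    def __init__(self, max_piece=65536):
        self.max_piece = max_piece  # hypothetical "upper bound" option
        self.buf = b""              # raw bytes not yet consumed
        self.remaining = 0          # payload bytes left in the current chunk
        self.state = "size"         # "size" | "data" | "crlf" | "done"

    def feed(self, data):
        """Feed raw bytes from the socket; return payload pieces ready to deliver."""
        self.buf += data
        out = []
        while self.buf and self.state != "done":
            if self.state == "size":
                line, sep, rest = self.buf.partition(b"\r\n")
                if not sep:
                    break  # chunk-size line is still incomplete
                self.buf = rest
                # Drop any chunk extensions after ';' and parse the hex size.
                self.remaining = int(line.split(b";", 1)[0].strip(), 16)
                self.state = "data" if self.remaining else "done"
            elif self.state == "data":
                avail = min(len(self.buf), self.remaining)
                if avail < self.remaining and avail < self.max_piece:
                    break  # neither a full chunk nor a full piece yet
                n = min(avail, self.max_piece)
                out.append(self.buf[:n])
                self.buf = self.buf[n:]
                self.remaining -= n
                if self.remaining == 0:
                    self.state = "crlf"
            else:  # "crlf": skip the CRLF that terminates the chunk data
                if len(self.buf) < 2:
                    break
                self.buf = self.buf[2:]
                self.state = "size"
        return out
```

With `max_piece=8`, a 16-byte chunk is handed over as two 8-byte pieces, while a 4-byte chunk is still delivered whole. Within chunk data, the buffer holds at most roughly `max_piece` bytes plus one socket read, no matter how large a chunk the upstream announces.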

For now, we have worked around the issue by switching to ibrowse, which allows specifying the desired behaviour.
