--- zzzz-none-000/linux-3.10.107/Documentation/block/biodoc.txt 2017-06-27 09:49:32.000000000 +0000 +++ scorpion-7490-727/linux-3.10.107/Documentation/block/biodoc.txt 2021-02-04 17:41:59.000000000 +0000 @@ -48,8 +48,7 @@ - Highmem I/O support - I/O scheduler modularization 1.2 Tuning based on high level requirements/capabilities - 1.2.1 I/O Barriers - 1.2.2 Request Priority/Latency + 1.2.1 Request Priority/Latency 1.3 Direct access/bypass to lower layers for diagnostics and special device operations 1.3.1 Pre-built commands @@ -255,29 +254,12 @@ What kind of support exists at the generic block layer for this ? The flags and rw fields in the bio structure can be used for some tuning -from above e.g indicating that an i/o is just a readahead request, or for -marking barrier requests (discussed next), or priority settings (currently -unused). As far as user applications are concerned they would need an -additional mechanism either via open flags or ioctls, or some other upper -level mechanism to communicate such settings to block. - -1.2.1 I/O Barriers - -There is a way to enforce strict ordering for i/os through barriers. -All requests before a barrier point must be serviced before the barrier -request and any other requests arriving after the barrier will not be -serviced until after the barrier has completed. This is useful for higher -level control on write ordering, e.g flushing a log of committed updates -to disk before the corresponding updates themselves. - -A flag in the bio structure, BIO_BARRIER is used to identify a barrier i/o. -The generic i/o scheduler would make sure that it places the barrier request and -all other requests coming after it after all the previous requests in the -queue. Barriers may be implemented in different ways depending on the -driver. For more details regarding I/O barriers, please read barrier.txt -in this directory. +from above e.g indicating that an i/o is just a readahead request, or priority +settings (currently unused). As far as user applications are concerned they +would need an additional mechanism either via open flags or ioctls, or some +other upper level mechanism to communicate such settings to block. -1.2.2 Request Priority/Latency +1.2.1 Request Priority/Latency Todo/Under discussion: Arjan's proposed request priority scheme allows higher levels some broad @@ -447,14 +429,13 @@ * main unit of I/O for the block layer and lower layers (ie drivers) */ struct bio { - sector_t bi_sector; struct bio *bi_next; /* request queue link */ struct block_device *bi_bdev; /* target device */ unsigned long bi_flags; /* status, command, etc */ unsigned long bi_rw; /* low bits: r/w, high: priority */ unsigned int bi_vcnt; /* how may bio_vec's */ - unsigned int bi_idx; /* current index into bio_vec array */ + struct bvec_iter bi_iter; /* current index into bio_vec array */ unsigned int bi_size; /* total size in bytes */ unsigned short bi_phys_segments; /* segments after physaddr coalesce*/ @@ -480,7 +461,7 @@ - Code that traverses the req list can find all the segments of a bio by using rq_for_each_segment. This handles the fact that a request has multiple bios, each of which can have multiple segments. -- Drivers which can't process a large bio in one shot can use the bi_idx +- Drivers which can't process a large bio in one shot can use the bi_iter field to keep track of the next bio_vec entry to process. (e.g a 1MB bio_vec needs to be handled in max 128kB chunks for IDE) [TBD: Should preferably also have a bi_voffset and bi_vlen to avoid modifying @@ -589,7 +570,7 @@ nr_sectors and current_nr_sectors fields (based on the corresponding hard_xxx values and the number of bytes transferred) and updates it on every transfer that invokes end_that_request_first. It does the same for the -buffer, bio, bio->bi_idx fields too. +buffer, bio, bio->bi_iter fields too. The buffer field is just a virtual address mapping of the current segment of the i/o buffer in cases where the buffer resides in low-memory. For high @@ -828,10 +809,6 @@ that requests are restarted in the order they were queue. This may happen if the driver needs to use blk_queue_invalidate_tags(). -Tagging also defines a new request flag, REQ_QUEUED. This is set whenever -a request is currently tagged. You should not use this flag directly, -blk_rq_tagged(rq) is the portable way to do so. - 3.3 I/O Submission The routine submit_bio() is used to submit a single io. Higher level i/o @@ -911,8 +888,8 @@ to refer to both parts and I/O scheduler to specific I/O schedulers. Block layer implements generic dispatch queue in block/*.c. -The generic dispatch queue is responsible for properly ordering barrier -requests, requeueing, handling non-fs requests and all other subtleties. +The generic dispatch queue is responsible for requeueing, handling non-fs +requests and all other subtleties. Specific I/O schedulers are responsible for ordering normal filesystem requests. They can also choose to delay certain requests to improve @@ -947,7 +924,11 @@ request safely. The io scheduler may still want to stop a merge at this point if it results in some sort of conflict internally, - this hook allows it to do that. + this hook allows it to do that. Note however + that two *requests* can still be merged at later + time. Currently the io scheduler has no way to + prevent that. It can only learn about the fact + from elevator_merge_req_fn callback. elevator_dispatch_fn* fills the dispatch queue with ready requests. I/O schedulers are free to postpone requests by @@ -1128,7 +1109,7 @@ as specified. Now bh->b_end_io is replaced by bio->bi_end_io, but most of the time the -right thing to use is bio_endio(bio, uptodate) instead. +right thing to use is bio_endio(bio) instead. If the driver is dropping the io_request_lock from its request_fn strategy, then it just needs to replace that with q->queue_lock instead.