ChatGPT解决这个技术问题 Extra ChatGPT

How do I upload a file with metadata using a REST web service?

I have a REST web service that currently exposes this URL:

http://server/data/media

where users can POST the following JSON:

{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873
}

in order to create a new Media metadata.

Now I need the ability to upload a file at the same time as the media metadata. What's the best way of going about this? I could introduce a new property called file and base64 encode the file, but I was wondering if there was a better way.

There's also using multipart/form-data like what a HTML form would send over, but I'm using a REST web service and I want to stick to using JSON if at all possible.

Sticking to using only JSON is not really required to have a RESTful web service. REST is basically just anything that follows the main principles of the HTTP methods and some other (arguably non-standardised) rules.

D
Darrel Miller

I agree with Greg that a two phase approach is a reasonable solution, however I would do it the other way around. I would do:

POST http://server/data/media
body:
{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873
}

To create the metadata entry and return a response like:

201 Created
Location: http://server/data/media/21323
{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873,
    "ContentUrl": "http://server/data/media/21323/content"
}

The client can then use this ContentUrl and do a PUT with the file data.

The nice thing about this approach is when your server starts get weighed down with immense volumes of data, the url that you return can just point to some other server with more space/capacity. Or you could implement some kind of round robin approach if bandwidth is an issue.


One advantage to sending the content first is that by the time the metadata exists, the content is already present. Ultimately the right answer depends on the organisation of the data in the system.
Thanks, I marked this as the correct answer because this is what I wanted to do. Unfortunately, due to a weird business rule, we have to allow the upload to occur in any order (metadata first or file first). I was wondering if there was a way to combine the two in order to save the headache of dealing with both situations.
@Daniel If you POST the data file first, then you can take the URL returned in Location and add it to the ContentUrl attribute in the metadata. That way, when the server receives the metadata, if a ContentUrl exists then it already knows where the file is. If there is no ContentUrl, then it knows that it should create one.
if you was to do the POST first, would you post to the same URL? (/server/data/media) or would you create another entry point for file-first uploads?
@Faraway What if the metadata included the number of "likes" of an image? Would you treat it as a single resource then? Or more obviously, are you suggesting that if I wanted to edit the description of an image, I would need to re-upload the image? There are many cases where multi-part forms are the right solution. It is just not always the case.
E
Erik Kaplun

Just because you're not wrapping the entire request body in JSON, doesn't meant it's not RESTful to use multipart/form-data to post both the JSON and the file(s) in a single request:

curl -F "metadata=<metadata.json" -F "file=@my-file.tar.gz" http://example.com/add-file

on the server side:

class AddFileResource(Resource):
    def render_POST(self, request):
        metadata = json.loads(request.args['metadata'][0])
        file_body = request.args['file'][0]
        ...

to upload multiple files, it's possible to either use separate "form fields" for each:

curl -F "metadata=<metadata.json" -F "file1=@some-file.tar.gz" -F "file2=@some-other-file.tar.gz" http://example.com/add-file

...in which case the server code will have request.args['file1'][0] and request.args['file2'][0]

or reuse the same one for many:

curl -F "metadata=<metadata.json" -F "files=@some-file.tar.gz" -F "files=@some-other-file.tar.gz" http://example.com/add-file

...in which case request.args['files'] will simply be a list of length 2.

or pass multiple files through a single field:

curl -F "metadata=<metadata.json" -F "files=@some-file.tar.gz,some-other-file.tar.gz" http://example.com/add-file

...in which case request.args['files'] will be a string containing all the files, which you'll have to parse yourself — not sure how to do it, but I'm sure it's not difficult, or better just use the previous approaches.

The difference between @ and < is that @ causes the file to get attached as a file upload, whereas < attaches the contents of the file as a text field.

P.S. Just because I'm using curl as a way to generate the POST requests doesn't mean the exact same HTTP requests couldn't be sent from a programming language such as Python or using any sufficiently capable tool.


I had been wondering about this approach myself, and why I hadn't seen anyone else put it forth yet. I agree, seems perfectly RESTful to me.
YES! This is very practical approach, and it isn't any less RESTful than using "application/json" as a content type for the whole request.
..but that's only possible if you have the data in a .json file and upload it, which is not the case
@mjolnic your comment is irrelevant: the cURL examples are just, well, examples; the answer explicitly states that you can use anything to send off the request... also, what prevents you from just writing curl -f 'metadata={"foo": "bar"}'?
I'm using this approach because the accepted answer wouldn't work for the application I'm developing (the file cannot exist before the data and it adds unnecessary complexity to handle the case where the data is uploaded first and the file never uploads).
G
Greg Hewgill

One way to approach the problem is to make the upload a two phase process. First, you would upload the file itself using a POST, where the server returns some identifier back to the client (an identifier might be the SHA1 of the file contents). Then, a second request associates the metadata with the file data:

{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873,
    "ContentID": "7a788f56fa49ae0ba5ebde780efe4d6a89b5db47"
}

Including the file data base64 encoded into the JSON request itself will increase the size of the data transferred by 33%. This may or may not be important depending on the overall size of the file.

Another approach might be to use a POST of the raw file data, but include any metadata in the HTTP request header. However, this falls a bit outside basic REST operations and may be more awkward for some HTTP client libraries.


You can use Ascii85 increasing just by 1/4.
Any reference on why base64 increases the size that much?
@jam01: Coincidentally, I just saw something yesterday which answers the space question well: What is the space overhead of Base64 encoding?
c
ccleve

I don't understand why, over the course of eight years, no one has posted the easy answer. Rather than encode the file as base64, encode the json as a string. Then just decode the json on the server side.

In Javascript:

let formData = new FormData();
formData.append("file", myfile);
formData.append("myjson", JSON.stringify(myJsonObject));

POST it using Content-Type: multipart/form-data

On the server side, retrieve the file normally, and retrieve the json as a string. Convert the string to an object, which is usually one line of code no matter what programming language you use.

(Yes, it works great. Doing it in one of my apps.)


I'm way more surprised that no one expanded on Mike's answer, cause that's exactly how multipart stuff should be used: each part has it's own mime-type and DRF's multipart parser, should dispatch accordingly. Perhaps it's hard to create this type of envelope on the client side. I really should investigate...
G
Greg Biles

I realize this is a very old question, but hopefully this will help someone else out as I came upon this post looking for the same thing. I had a similar issue, just that my metadata was a Guid and int. The solution is the same though. You can just make the needed metadata part of the URL.

POST accepting method in your "Controller" class:

public Task<HttpResponseMessage> PostFile(string name, float latitude, float longitude)
{
    //See http://stackoverflow.com/a/10327789/431906 for how to accept a file
    return null;
}

Then in whatever you're registering routes, WebApiConfig.Register(HttpConfiguration config) for me in this case.

config.Routes.MapHttpRoute(
    name: "FooController",
    routeTemplate: "api/{controller}/{name}/{latitude}/{longitude}",
    defaults: new { }
);

T
Tenaciousd93

If your file and its metadata creating one resource, its perfectly fine to upload them both in one request. Sample request would be :

POST https://target.com/myresources/resourcename HTTP/1.1

Accept: application/json

Content-Type: multipart/form-data; 

boundary=-----------------------------28947758029299

Host: target.com

-------------------------------28947758029299

Content-Disposition: form-data; name="application/json"

{"markers": [
        {
            "point":new GLatLng(40.266044,-74.718479), 
            "homeTeam":"Lawrence Library",
            "awayTeam":"LUGip",
            "markerImage":"images/red.png",
            "information": "Linux users group meets second Wednesday of each month.",
            "fixture":"Wednesday 7pm",
            "capacity":"",
            "previousScore":""
        },
        {
            "point":new GLatLng(40.211600,-74.695702),
            "homeTeam":"Hamilton Library",
            "awayTeam":"LUGip HW SIG",
            "markerImage":"images/white.png",
            "information": "Linux users can meet the first Tuesday of the month to work out harward and configuration issues.",
            "fixture":"Tuesday 7pm",
            "capacity":"",
            "tv":""
        },
        {
            "point":new GLatLng(40.294535,-74.682012),
            "homeTeam":"Applebees",
            "awayTeam":"After LUPip Mtg Spot",
            "markerImage":"images/newcastle.png",
            "information": "Some of us go there after the main LUGip meeting, drink brews, and talk.",
            "fixture":"Wednesday whenever",
            "capacity":"2 to 4 pints",
            "tv":""
        },
] }

-------------------------------28947758029299

Content-Disposition: form-data; name="name"; filename="myfilename.pdf"

Content-Type: application/octet-stream

%PDF-1.4
%
2 0 obj
<</Length 57/Filter/FlateDecode>>stream
x+r
26S00SI2P0Qn
F
!i\
)%!Y0i@.k
[
endstream
endobj
4 0 obj
<</Type/Page/MediaBox[0 0 595 842]/Resources<</Font<</F1 1 0 R>>>>/Contents 2 0 R/Parent 3 0 R>>
endobj
1 0 obj
<</Type/Font/Subtype/Type1/BaseFont/Helvetica/Encoding/WinAnsiEncoding>>
endobj
3 0 obj
<</Type/Pages/Count 1/Kids[4 0 R]>>
endobj
5 0 obj
<</Type/Catalog/Pages 3 0 R>>
endobj
6 0 obj
<</Producer(iTextSharp 5.5.11 2000-2017 iText Group NV \(AGPL-version\))/CreationDate(D:20170630120636+02'00')/ModDate(D:20170630120636+02'00')>>
endobj
xref
0 7
0000000000 65535 f 
0000000250 00000 n 
0000000015 00000 n 
0000000338 00000 n 
0000000138 00000 n 
0000000389 00000 n 
0000000434 00000 n 
trailer
<</Size 7/Root 5 0 R/Info 6 0 R/ID [<c7c34272c2e618698de73f4e1a65a1b5><c7c34272c2e618698de73f4e1a65a1b5>]>>
%iText-5.5.11
startxref
597
%%EOF

-------------------------------28947758029299--

W
Will59

To build on ccleve's answer, if you are using superagent / express / multer, on the front end side build your multipart request doing something like this:

superagent
    .post(url)
    .accept('application/json')
    .field('myVeryRelevantJsonData', JSON.stringify({ peep: 'Peep Peep!!!' }))
    .attach('myFile', file);

cf https://visionmedia.github.io/superagent/#multipart-requests.

On the express side, whatever was passed as field will end up in req.body after doing:

app.use(express.json({ limit: '3MB' }));

Your route would include something like this:

const multerMemStorage = multer.memoryStorage();
const multerUploadToMem = multer({
  storage: multerMemStorage,
  // Also specify fileFilter, limits...
});

router.post('/myUploads',
  multerUploadToMem.single('myFile'),
  async (req, res, next) => {
    // Find back myVeryRelevantJsonData :
    logger.verbose(`Uploaded req.body=${JSON.stringify(req.body)}`);

    // If your file is text:
    const newFileText = req.file.buffer.toString();
    logger.verbose(`Uploaded text=${newFileText}`);
    return next();
  },
  ...

One thing to keep in mind though is this note from the multer doc, concerning disk storage:

Note that req.body might not have been fully populated yet. It depends on the order that the client transmits fields and files to the server.

I guess this means it would be unreliable to, say, compute the target dir/filename based on json metadata passed along the file