在 Mongoose 中填充后查询

node.js mongodb mongoose

一般来说，我对 Mongoose 和 MongoDB 还是很陌生，所以我很难弄清楚这样的事情是否可行：

Item = new Schema({
    id: Schema.ObjectId,
    dateCreated: { type: Date, default: Date.now },
    title: { type: String, default: 'No Title' },
    description: { type: String, default: 'No Description' },
    tags: [ { type: Schema.ObjectId, ref: 'ItemTag' }]
});

ItemTag = new Schema({
    id: Schema.ObjectId,
    tagId: { type: Schema.ObjectId, ref: 'Tag' },
    tagName: { type: String }
});



var query = Models.Item.find({});

query
    .desc('dateCreated')
    .populate('tags')
    .where('tags.tagName').in(['funny', 'politics'])
    .run(function(err, docs){
       // docs is always empty
    });

有更好的方法吗？

编辑

对任何混淆表示歉意。我要做的是获取所有包含有趣标签或政治标签的项目。

编辑

没有 where 子句的文档：

[{ 
    _id: 4fe90264e5caa33f04000012,
    dislikes: 0,
    likes: 0,
    source: '/uploads/loldog.jpg',
    comments: [],
    tags: [{
        itemId: 4fe90264e5caa33f04000012,
        tagName: 'movies',
        tagId: 4fe64219007e20e644000007,
        _id: 4fe90270e5caa33f04000015,
        dateCreated: Tue, 26 Jun 2012 00:29:36 GMT,
        rating: 0,
        dislikes: 0,
        likes: 0 
    },
    { 
        itemId: 4fe90264e5caa33f04000012,
        tagName: 'funny',
        tagId: 4fe64219007e20e644000002,
        _id: 4fe90270e5caa33f04000017,
        dateCreated: Tue, 26 Jun 2012 00:29:36 GMT,
        rating: 0,
        dislikes: 0,
        likes: 0 
    }],
    viewCount: 0,
    rating: 0,
    type: 'image',
    description: null,
    title: 'dogggg',
    dateCreated: Tue, 26 Jun 2012 00:29:24 GMT 
 }, ... ]

使用 where 子句，我得到一个空数组。

Neil Lunn

对于大于 3.2 的现代 MongoDB，在大多数情况下，您可以使用 $lookup 作为 .populate() 的替代品。这还具有实际执行“在服务器上”的连接的优点，而不是 .populate() 所做的实际上是 “多个查询”以“模拟”连接。

因此，就关系数据库的执行方式而言，.populate() 不是真正的“连接”。另一方面，$lookup 运算符实际上在服务器上完成工作，并且或多或少类似于 “LEFT JOIN”：

Item.aggregate(
  [
    { "$lookup": {
      "from": ItemTags.collection.name,
      "localField": "tags",
      "foreignField": "_id",
      "as": "tags"
    }},
    { "$unwind": "$tags" },
    { "$match": { "tags.tagName": { "$in": [ "funny", "politics" ] } } },
    { "$group": {
      "_id": "$_id",
      "dateCreated": { "$first": "$dateCreated" },
      "title": { "$first": "$title" },
      "description": { "$first": "$description" },
      "tags": { "$push": "$tags" }
    }}
  ],
  function(err, result) {
    // "tags" is now filtered by condition and "joined"
  }
)

注意这里的 .collection.name 实际上计算为“字符串”，即分配给模型的 MongoDB 集合的实际名称。由于默认情况下 mongoose 会“复数”集合名称，并且 $lookup 需要实际的 MongoDB 集合名称作为参数（因为它是服务器操作），所以这是在 mongoose 代码中使用的一个方便技巧，而不是“硬编码”集合直接命名。

虽然我们也可以在数组上使用 $filter 来删除不需要的项目，但这实际上是最有效的形式，因为 Aggregation Pipeline Optimization 用于 as $lookup 的特殊条件，然后是 $unwind 和 $match 条件.

这实际上导致三个流水线阶段合二为一：

   { "$lookup" : {
     "from" : "itemtags",
     "as" : "tags",
     "localField" : "tags",
     "foreignField" : "_id",
     "unwinding" : {
       "preserveNullAndEmptyArrays" : false
     },
     "matching" : {
       "tagName" : {
         "$in" : [
           "funny",
           "politics"
         ]
       }
     }
   }}

这是高度优化的，因为实际操作“首先过滤要加入的集合”，然后返回结果并“展开”数组。两种方法都被使用，因此结果不会打破 16MB 的 BSON 限制，这是客户端没有的约束。

唯一的问题是它在某些方面似乎“违反直觉”，特别是当您想要数组中的结果时，但这就是 $group 在这里的用途，因为它重构为原始文档形式。

同样不幸的是，我们此时根本无法以服务器使用的相同最终语法实际编写 $lookup。恕我直言，这是一个需要纠正的疏忽。但就目前而言，简单地使用序列就可以了，并且是具有最佳性能和可扩展性的最可行的选择。

附录 - MongoDB 3.6 及更高版本

尽管此处显示的模式相当优化，因为其他阶段如何进入 $lookup，但它确实有一个失败之处在于“LEFT JOIN”通常是两个 $lookup 所固有的并且 populate() 的操作被此处 $unwind 的 “最佳” 用法所否定，它不保留空数组。您可以添加 preserveNullAndEmptyArrays 选项，但这会否定上述 “优化” 序列，并且基本上保留所有三个阶段，这些阶段通常会在优化中组合。

MongoDB 3.6 扩展了 $lookup 的“更具表现力” 形式，允许“子管道”表达式。这不仅满足了保留“LEFT JOIN”的目标，而且还允许优化查询以减少返回的结果，并且语法大大简化：

Item.aggregate([
  { "$lookup": {
    "from": ItemTags.collection.name,
    "let": { "tags": "$tags" },
    "pipeline": [
      { "$match": {
        "tags": { "$in": [ "politics", "funny" ] },
        "$expr": { "$in": [ "$_id", "$$tags" ] }
      }}
    ]
  }}
])

用于将声明的“本地”值与“外部”值匹配的 $expr 实际上是 MongoDB 现在使用原始 $lookup 语法“内部”执行的操作。通过以这种形式表达，我们可以自己定制“子管道”中的初始 $match 表达式。

事实上，作为一个真正的“聚合管道”，您几乎可以在这个“子管道”表达式中使用聚合管道执行任何操作，包括将 $lookup 的级别“嵌套”到其他相关集合。

进一步的使用有点超出了这里的问题的范围，但就“嵌套人口”而言，$lookup 的新使用模式允许它大致相同，并且 "lot"< /em> 在它的充分使用中更强大。

工作示例

下面给出一个在模型上使用静态方法的例子。一旦实现了该静态方法，调用就变成了：

Item.lookup( { path: 'tags', query: { 'tags.tagName' : { '$in': [ 'funny', 'politics' ] } } }, callback )

或者增强为更现代一点，甚至变成：

let results = await Item.lookup({ path: 'tags', query: { 'tagName' : { '$in': [ 'funny', 'politics' ] } } })

使其在结构上与 .populate() 非常相似，但它实际上是在服务器上进行连接。为了完整起见，这里的用法根据父子案例将返回的数据转换回 mongoose 文档实例。

对于大多数常见情况，它相当简单且易于适应或仅按原样使用。

注意这里使用 async 只是为了简洁地运行随附的示例。实际的实现没有这种依赖。

const async = require('async'), mongoose = require('mongoose'), Schema = mongoose.Schema; mongoose.Promise = global.Promise; mongoose.set('debug', true); mongoose.connect('mongodb://localhost/looktest'); const itemTagSchema = new Schema({ tagName: String }); const itemSchema = new Schema({ dateCreated: { type: Date, default: Date.now }, title: String, description: String, tags: [{ type: Schema.Types.ObjectId, ref: 'ItemTag' }] }); itemSchema.statics.lookup = function(opt,callback) { let rel = mongoose.model(this.schema.path(opt.path).caster.options.ref); let group = { "$group": { } }; this.schema.eachPath(p => group.$group[p] = (p === "_id") ? "$_id" : (p === opt.path) ? { "$push": `$${p}` } : { "$first": `$${p}` }); let pipeline = [ { "$lookup": { "from": rel.collection.name, "as": opt.path, "localField": opt.path, "foreignField": "_id" }}, { "$unwind": `$${opt.path}` }, { "$match": opt.query }, group ]; this.aggregate(pipeline,(err,result) => { if (err) callback(err); result = result.map(m => { m[opt.path] = m[opt.path].map(r => rel(r)); return this(m); }); callback(err,result); }); } const Item = mongoose.model('Item', itemSchema); const ItemTag = mongoose.model('ItemTag', itemTagSchema); function log(body) { console.log(JSON.stringify(body, undefined, 2)) } async.series( [ // Clean data (callback) => async.each(mongoose.models,(model,callback) => model.remove({},callback),callback), // Create tags and items (callback) => async.waterfall( [ (callback) => ItemTag.create([{ "tagName": "movies" }, { "tagName": "funny" }], callback), (tags, callback) => Item.create({ "title": "Something","description": "An item", "tags": tags },callback) ], callback ), // Query with our static (callback) => Item.lookup( { path: 'tags', query: { 'tags.tagName' : { '$in': [ 'funny', 'politics' ] } } }, callback ) ], (err,results) => { if (err) throw err; let result = results.pop(); log(result); mongoose.disconnect(); } )

或者更现代一点的 Node 8.x 及更高版本，带有 async/await 并且没有额外的依赖项：

const { Schema } = mongoose = require('mongoose'); const uri = 'mongodb://localhost/looktest'; mongoose.Promise = global.Promise; mongoose.set('debug', true); const itemTagSchema = new Schema({ tagName: String }); const itemSchema = new Schema({ dateCreated: { type: Date, default: Date.now }, title: String, description: String, tags: [{ type: Schema.Types.ObjectId, ref: 'ItemTag' }] }); itemSchema.statics.lookup = function(opt) { let rel = mongoose.model(this.schema.path(opt.path).caster.options.ref); let group = { "$group": { } }; this.schema.eachPath(p => group.$group[p] = (p === "_id") ? "$_id" : (p === opt.path) ? { "$push": `$${p}` } : { "$first": `$${p}` }); let pipeline = [ { "$lookup": { "from": rel.collection.name, "as": opt.path, "localField": opt.path, "foreignField": "_id" }}, { "$unwind": `$${opt.path}` }, { "$match": opt.query }, group ]; return this.aggregate(pipeline).exec().then(r => r.map(m => this({ ...m, [opt.path]: m[opt.path].map(r => rel(r)) }) )); } const Item = mongoose.model('Item', itemSchema); const ItemTag = mongoose.model('ItemTag', itemTagSchema); const log = body => console.log(JSON.stringify(body, undefined, 2)); (async function() { try { const conn = await mongoose.connect(uri); // Clean data await Promise.all(Object.entries(conn.models).map(([k,m]) => m.remove())); // Create tags and items const tags = await ItemTag.create( ["movies", "funny"].map(tagName =>({ tagName })) ); const item = await Item.create({ "title": "Something", "description": "An item", tags }); // Query with our static const result = (await Item.lookup({ path: 'tags', query: { 'tags.tagName' : { '$in': [ 'funny', 'politics' ] } } })).pop(); log(result); mongoose.disconnect(); } catch (e) { console.error(e); } finally { process.exit() } })()

从 MongoDB 3.6 及更高版本开始，即使没有 $unwind 和 $group 构建：

const { Schema, Types: { ObjectId } } = mongoose = require('mongoose'); const uri = 'mongodb://localhost/looktest'; mongoose.Promise = global.Promise; mongoose.set('debug', true); const itemTagSchema = new Schema({ tagName: String }); const itemSchema = new Schema({ title: String, description: String, tags: [{ type: Schema.Types.ObjectId, ref: 'ItemTag' }] },{ timestamps: true }); itemSchema.statics.lookup = function({ path, query }) { let rel = mongoose.model(this.schema.path(path).caster.options.ref); // MongoDB 3.6 and up $lookup with sub-pipeline let pipeline = [ { "$lookup": { "from": rel.collection.name, "as": path, "let": { [path]: `$${path}` }, "pipeline": [ { "$match": { ...query, "$expr": { "$in": [ "$_id", `$$${path}` ] } }} ] }} ]; return this.aggregate(pipeline).exec().then(r => r.map(m => this({ ...m, [path]: m[path].map(r => rel(r)) }) )); }; const Item = mongoose.model('Item', itemSchema); const ItemTag = mongoose.model('ItemTag', itemTagSchema); const log = body => console.log(JSON.stringify(body, undefined, 2)); (async function() { try { const conn = await mongoose.connect(uri); // Clean data await Promise.all(Object.entries(conn.models).map(([k,m]) => m.remove())); // Create tags and items const tags = await ItemTag.insertMany( ["movies", "funny"].map(tagName => ({ tagName })) ); const item = await Item.create({ "title": "Something", "description": "An item", tags }); // Query with our static let result = (await Item.lookup({ path: 'tags', query: { 'tagName': { '$in': [ 'funny', 'politics' ] } } })).pop(); log(result); await mongoose.disconnect(); } catch(e) { console.error(e) } finally { process.exit() } })()

我不再使用 Mongo / Mongoose，但我接受了你的回答，因为这是一个受欢迎的问题，看起来这对其他人有帮助。很高兴看到这个问题现在有一个更具可扩展性的解决方案。感谢您提供更新的答案。

谢谢<3，，，，

a

aaronheckmann

不直接支持您要求的内容，但可以通过在查询返回后添加另一个过滤步骤来实现。

首先，.populate( 'tags', null, { tagName: { $in: ['funny', 'politics'] } } ) 绝对是您过滤标签文档所需要做的。然后，在查询返回后，您需要手动过滤掉没有任何符合填充条件的 tags 文档的文档。就像是：

query.... .exec(function(err, docs){ docs = docs.filter(function(doc){ return doc.tags.length; }) // do stuff with docs });

嘿亚伦，谢谢你的回复。我可能错了，但 populate() 上的 $in 不会只填充匹配的标签吗？因此，该项目上的任何其他标签都将被过滤掉。听起来我必须填充所有项目，然后让第二个过滤步骤根据标签名称减少它。

@aaronheckmann 我已经实现了您建议的解决方案，您将在 .exec 之后进行过滤，因为尽管填充查询仅填充所需的对象，但仍返回整个数据集。您是否认为在较新版本的 Mongoose 中有一些选项可以仅返回填充的数据集，因此我们不需要进行其他过滤？

我也很想知道性能，如果查询最后返回整个数据集，那么没有进行人口过滤的目的吗？你说什么？我正在调整人口查询以进行性能优化，但是这样对于大型数据集的性能不会变得更好？

如果其他人有兴趣，mongoosejs.com/docs/api.html#query_Query-populate 会提供所有详细信息

填充时如何在不同字段中匹配？

S

Sergio Tulentsev

尝试更换

.populate('tags').where('tags.tagName').in(['funny', 'politics'])

经过

.populate( 'tags', null, { tagName: { $in: ['funny', 'politics'] } } )

谢谢回复。我相信这样做只会用有趣或政治来填充每个项目，这不会减少父列表。我真正想要的只是标签中有有趣或政治的物品。

你能展示一下你的文档是什么样子的吗？因为 tags 数组中的 'where' 对我来说似乎是一个有效的操作..我们只是语法错误..您是否尝试过完全删除该 'where' 子句并检查是否返回了任何内容？或者，只是为了测试编写 'tags.tagName' 在语法上是否可行，您可能会暂时忘记 ref 的内容，并尝试在 'Item' 文档中使用嵌入式数组进行查询。

用文档编辑了我的原始帖子。我能够成功地将模型作为 Item 内的嵌入式数组进行测试，但不幸的是，我要求它是 DBRef，因为 ItemTag 经常更新。再次感谢您的帮助。

F

Fabian

更新：请查看评论 - 这个答案与问题不正确匹配，但也许它回答了用户遇到的其他问题（我认为这是因为赞成票）所以我不会删除这个“答案”：

第一：我知道这个问题真的过时了，但我搜索了这个问题，这个 SO 帖子是谷歌条目 #1。所以我实现了 docs.filter 版本（接受的答案），但正如我在 mongoose v4.6.0 docs 中所读到的，我们现在可以简单地使用：

Item.find({}).populate({ path: 'tags', match: { tagName: { $in: ['funny', 'politics'] }} }).exec((err, items) => { console.log(items.tags) // contains only tags where tagName is 'funny' or 'politics' })

希望这对未来的搜索机用户有所帮助。

但这肯定只会过滤 items.tags 数组吗？无论标签名称如何，项目都将被退回...

没错，@OllyBarca。根据文档，匹配仅影响人口查询。

我认为这不能回答问题

@Fabian 这不是错误。只有人口查询（在本例中为 fans）被过滤。返回的实际文档（即 Story，包含 fans 作为属性）不受影响或被过滤。

因此，由于评论中提到的原因，这个答案是不正确的。以后看到这个的任何人都应该小心。

O

OllyBarca

最近自己遇到了同样的问题后，我想出了以下解决方案：

首先，查找 tagName 为 'funny' 或 'politics' 的所有 ItemTags，并返回 ItemTag _ids 数组。

然后，在 tags 数组中找到包含所有 ItemTag _id 的 Items

ItemTag .find({ tagName : { $in : ['funny','politics'] } }) .lean() .distinct('_id') .exec((err, itemTagIds) => { if (err) { console.error(err); } Item.find({ tag: { $all: itemTagIds} }, (err, items) => { console.log(items); // Items filtered by tagName }); });

我是怎么做到的 const tagsIds = await this.tagModel .find({ name: { $in: tags } }) .lean() .distinct('_id');返回 this.adviceModel.find({ tags: { $all: tagsIds } });

H

HernanFila

@aaronheckmann 's answer 对我有用，但我必须将 return doc.tags.length; 替换为 return doc.tags != null;，因为该字段包含 null 如果它与填充内写入的条件不匹配。所以最后的代码：

query.... .exec(function(err, docs){ docs = docs.filter(function(doc){ return doc.tags != null; }) // do stuff with docs });

关注公众号

不定期副业成功案例分享

想领先一步获取最新的外包任务吗？
立即订阅

相似问题

如何在 Node.js 中使用 Mongoose 进行分页？

如何在 Mongoose 中更新/插入文档？

mongodb/mongoose findMany - 查找 ID 列在数组中的所有文档

比较猫鼬_id和字符串

MongoDB/Mongoose 在特定日期查询？

Mongoose 中的“__v”字段是什么

保存后猫鼬填充

在猫鼬中填充嵌套数组

E11000 mongodb mongoose 中的重复键错误索引

Mongoose：findOneAndUpdate 不返回更新的文档

在 Mongoose 中填充后查询

关注公众号

想领先一步获取最新的外包任务吗？

相似问题

平台

支持

友情链接

联系我们