Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: added new Lua API for Nginx core's dynamic resolver (FFI). #235

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

slimboyfat
Copy link

I hereby granted the copyright of the changes in this pull request
to the authors of this lua-resty-core project.

@slimboyfat slimboyfat changed the title feature: added new Lua API for Nginx core's dynamic resolver. feature: added new Lua API for Nginx core's dynamic resolver (FFI). Mar 6, 2019
lib/ngx/resolver.md Outdated Show resolved Hide resolved
lib/ngx/resolver.md Show resolved Hide resolved
t/resolver.t Outdated Show resolved Hide resolved
lib/ngx/resolver.lua Outdated Show resolved Hide resolved
lib/ngx/resolver.lua Outdated Show resolved Hide resolved
ngx_int_t rc;
unsigned ipv4:1;
unsigned ipv6:1;
} ngx_http_lua_resolver_ctx_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to expose such complex C struct as part of the ABI. It's quite a maintenance burden. Better avoid it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. But such struct allows avoid overhead of mem-pool creation/freeing + additional mem-allocations on C-land. So, it is just a trade-off between maintenance burden and additional cpu cycles.

What do you think?

lib/ngx/resolver.lua Outdated Show resolved Hide resolved
res, err = nil, "unknown error"
end

C.ngx_http_lua_ffi_resolver_destroy(ffi_gc(ctx, nil))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong. ffi.gc() is for memory blocks allocated on the C land. The ctx is allocated on the Lua land and thus is managed by Lua GC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me explain my idea in step-by-step:

  1. At C land I defined a finalizer which guarantees finalization of resolver's context allocated during a resolve call.

  2. The finalizer is registered via __gc metamethod for FFI cdata object Ctx.

At this point we have a guarante that the finalizer will be eventually called, as http://luajit.org/ext_ffi_api.html states that:

A cdata finalizer works like the __gc metamethod for userdata objects: when the last reference to a cdata object is gone, the associated finalizer is called with the cdata object as an argument.
...
An existing finalizer can be removed by setting a nil finalizer, e.g. right before explicitly deleting a resource:
...

But in Lua 5.1 manual (https://www.lua.org/manual/5.1/manual.html)
at "2.10.1 – Garbage-Collection Metamethods" we can find following note:

At the end of each garbage-collection cycle, the finalizers for userdata are called in reverse order of their creation, among those collected in that cycle. That is, the first finalizer to be called is the one associated with the userdata created last in the program. The userdata itself is freed only in the next garbage-collection cycle.

Thus object with finalizer requires 2 GC cycles (1st GC-cycle - call finalizer, 2nd cycle - memory reclamation) to get fully freed.
That is why, in my code, I make a third step.

  1. Perform explicit finalizator call and then unregister __gc finalizer.

C.ngx_http_lua_ffi_resolver_destroy(ffi_gc(ctx, nil))
is equvivalent for

C.ngx_http_lua_ffi_resolver_destroy(ctx) -- explicitly call finalizer
ffi_gc(ctx, nil) -- unregister finalizer (set __gc = nil at ctx's mt)

Now, ctx object will be freed in a single GC-cycle. So, from my point of view, it is just an optimization, which makes Lua heap cleaner a bit faster.

Please, let me know where I was wrong...

P.S.: I found several places in the OpenResty source code where the same idiom was used (e.g.: regex.lua - ngx_lua_ffi_destroy_regex(ffi_gc(compiled, nil)))

ctx.ipv4 = ipv4 == nil and 1 or ipv4
ctx.ipv6 = ipv6 == nil and 0 or ipv6

local rc = C.ngx_http_lua_ffi_resolve(ctx, hostname)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know that ctx is managed by Lua GC while your ctx may get collected as soon as this Lua function returns (or yields). I see you register ctx to rctx which will surely outlive this Lua function call once it yields.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really believed that coroutine function locals must not be collected by GC, until function returns (yield is not a case when local might be get freed by GC).

I spent several hours for reading manual and googling. Unfortunately, I did not find anything that clearly states what should be during a yield with locals.

Based on following I wrote several tests.
Lua: test suites https://www.lua.org/tests/ (for v5.1)
Lua Traverse http://code.matthewwild.co.uk/luatraverse/

local object_was_finalized

local function finalizer()
    object_was_finalized = true
end

local function func()
    local object = newproxy(true)
    getmetatable(object).__gc = finalizer
    local free = coroutine.yield()
    if free then
        object = nil
    end
    coroutine.yield()
end

-- Test #1
object_was_finalized = false
local t = coroutine.create(func)
coroutine.resume(t)
collectgarbage()
assert(object_was_finalized == false)
coroutine.resume(t, true)
collectgarbage()
assert(object_was_finalized == true)

-- Test #2 
object_was_finalized = false
t = coroutine.create(func)
coroutine.resume(t)
collectgarbage()
assert(object_was_finalized == false)
coroutine.resume(t, false)
collectgarbage()
assert(object_was_finalized == false)
coroutine.resume(t)
collectgarbage()
assert(object_was_finalized == true)

... and still not sure, if local variable can be GC-ed on yield.

Could you please point me to manual/link which covers this (GC on yield).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, I created a test with explicit lua_gc calls (from C-land) just before return into yielded-lua-thread and right after the return from resumed thread.
The test confirmed assertion that Ctx can get freed only after return from coroutine (not in the middle on yield).

Please, give me a direction/additional info on this.

@slimboyfat slimboyfat force-pushed the feature/ngx-resolve branch from f1b0825 to 77f1630 Compare March 14, 2019 20:32
@mikz
Copy link

mikz commented Apr 17, 2019

This looks useful, but maybe it would be better to design it more future-proof? What about using this for custom load balancing in balancer_by_lua? This would need to return all the resolved addresses, so the balancer code can do the decision. What about other classes than A records? SRV for example?

If this API exists, I think it should aim to replace the need for https://github.com/openresty/lua-resty-dns.

@slimboyfat
Copy link
Author

slimboyfat commented Apr 18, 2019

Yes, it can be extended to retrieve all entries from DNS response and return Lua table. At the moment, only A and AAAA records are supported, but add support for SRV-records is not a big deal (as Nginx internal resolver supports SRV queries).

Benefits of proposed ngx.resolve() API are

  • configurable caching policy;
  • ability to efficiently handle sequential queries for a same host;

@agentzh had several remarks on correctness of proposed implementation, but he possible is busy and can not help me.

@agentzh, @thibaultcha, can you help me?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants