2006-09-02 weird DNS problem

from HTYP, the free directory anyone can edit if they can prove to me that they're not a spambot

Original Problem

I can't tell if this means I'm being hacked somehow or if it's just a network glitch at EarthLink, but at essentially random times Samba will start returning weird IP addresses for local machines on the network. The addresses are consistently within a particular range, though the exact assignments seem to change.

My main worry is that this means my network traffic is being routed via some hacker's machine (which would be consistent with some odd delays and errors loading web pages) which might then allow passwords and such to be picked up.

Some quick pastes:

From gonzo (KUbuntu Dapper):

  • net lookup gonzo: 192.168.0.103
  • net lookup bunsen: 209.86.66.92
  • net lookup beaker: 209.86.66.91
  • net lookup mokey: 209.86.66.93
  • net lookup floyd: 192.168.0.110
  • net lookup melorr: 209.86.66.90
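
A quick way to flag the bad answers is to test each result against the LAN subnet. The following is a minimal sketch, assuming the LAN is 192.168.0.0/24 (inferred from the addresses above, not confirmed anywhere); the function name off_lan is just for illustration.

```python
import ipaddress

# Assumption: the LAN is 192.168.0.0/24, inferred from the pasted addresses.
LAN = ipaddress.ip_network("192.168.0.0/24")

# Results copied from the "net lookup" pastes on gonzo.
lookups = {
    "gonzo": "192.168.0.103",
    "bunsen": "209.86.66.92",
    "beaker": "209.86.66.91",
    "mokey": "209.86.66.93",
    "floyd": "192.168.0.110",
    "melorr": "209.86.66.90",
}


def off_lan(results, lan=LAN):
    """Return the names whose looked-up address lies outside the LAN subnet."""
    return sorted(
        name
        for name, addr in results.items()
        if ipaddress.ip_address(addr) not in lan
    )


suspicious = off_lan(lookups)
print(suspicious)  # ['beaker', 'bunsen', 'melorr', 'mokey']
```

Any local machine name that shows up in this list has been resolved by something other than the LAN itself.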

Similar (but not identical) results doing "ping" to various machines from Beaker (Win98), but everything went back to normal when I rebooted it.

  • traceroute 209.86.66.92
traceroute to 209.86.66.92 (209.86.66.92), 30 hops max, 40 byte packets
1 192.168.0.1 (192.168.0.1) 0.798 ms 0.365 ms 0.237 ms
2 10.40.64.1 (10.40.64.1) 7.091 ms 6.291 ms 11.775 ms
3 srp8-0.rlghnca-rtr1.nc.rr.com (24.25.2.161) 8.231 ms 6.120 ms 7.064 ms
4 pos14-0.rlghncrdc-rtr1.nc.rr.com (24.25.0.5) 8.592 ms 7.250 ms 7.695 ms
5 pos12-0.rlghncrdc-rtr2.nc.rr.com (24.93.64.37) 8.393 ms 7.100 ms 6.096 ms
6 tenge-1-4.car1.Raleigh1.Level3.net (4.71.160.1) 23.241 ms 21.624 ms 22.056 ms
7 ae-11-11.car2.Raleigh1.Level3.net (4.69.132.174) 79.548 ms 21.611 ms 21.724 ms
8 ae-6-6.ebr2.Washington1.Level3.net (4.69.132.178) 27.750 ms * 31.726 ms
9 ae-24-56.car4.Washington1.Level3.net (4.68.121.177) 85.176 ms ae-24-52.car4.Washington1.Level3.net (4.68.121.49) 24.588 ms ae-24-54.car4.Washington1.Level3.net (4.68.121.113) 28.596 ms
10 unknown.Level3.net (64.158.57.50) 19.989 ms 18.623 ms 24.089 ms
11 bor02-so-3-1.ga-atlanta0.ne.earthlink.net (209.165.110.73) 24.023 ms 21.807 ms 28.283 ms
12 bor01-ge-1-0-0.ga-atlanta1.ne.earthlink.net (209.165.110.106) 24.091 ms 22.520 ms 24.604 ms
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *

nslookup 209.86.66.92

Server:         192.168.0.1
Address:        192.168.0.1#53

Non-authoritative answer:
92.66.86.209.in-addr.arpa       name = elydm.03.am.barefruit.com.

woozle@gonzo:~$ nslookup 209.86.66.91

Server:         192.168.0.1
Address:        192.168.0.1#53

** server can't find 91.66.86.209.in-addr.arpa: SERVFAIL

woozle@gonzo:~$ nslookup 209.86.66.91

Server:         192.168.0.1
Address:        192.168.0.1#53

Non-authoritative answer:
91.66.86.209.in-addr.arpa       name = elydm.02.am.barefruit.com.
  • 209.86.66.90: elydm.01.am.barefruit.com
  • 209.86.66.91: elydm.02.am.barefruit.com
  • 209.86.66.92: elydm.03.am.barefruit.com

2006-09-04 More information

C:\WINDOWS>tracert floyd

Tracing route to floyd.earthlink.net [209.86.66.94] over a maximum of 30 hops:

 1   <10 ms   <10 ms   <10 ms  192.168.0.1
 2     7 ms    12 ms     6 ms  10.40.64.1
 3     7 ms     8 ms    11 ms  srp8-0.rlghnca-rtr2.nc.rr.com [24.25.2.163]
 4     9 ms     7 ms     7 ms  pos14-0.rlghncrdc-rtr2.nc.rr.com [24.25.0.9]
 5    12 ms    13 ms    13 ms  son1-0-1.chrlncsa-rtr6.carolina.rr.com [24.93.64.81]
 6    12 ms    13 ms    11 ms  pop1-cha-P4-0.atdn.net [66.185.132.45]
 7    12 ms    12 ms    12 ms  bb1-cha-P3-0.atdn.net [66.185.138.64]
 8    17 ms    17 ms    17 ms  bb1-atm-P6-0.atdn.net [66.185.152.182]
 9    17 ms    17 ms    18 ms  pop1-atm-P0-0.atdn.net [66.185.147.193]
10    17 ms    17 ms    18 ms  Earthlink.atdn.net [66.185.150.6]
11    16 ms    17 ms    17 ms  floyd.earthlink.net [209.86.66.94]

2006-09-08 More logs

A series of "net lookup"s, all done within less than a minute:

woozle@gonzo:~$ net lookup floyd -l
209.86.66.90
woozle@gonzo:~$ net lookup floyd -l
192.168.0.101
woozle@gonzo:~$ net lookup floyd -l
209.86.66.91
woozle@gonzo:~$ net lookup floyd -l
209.86.66.91
woozle@gonzo:~$ net lookup floyd -l
192.168.0.101
woozle@gonzo:~$ net lookup floyd -l
209.86.66.92
woozle@gonzo:~$ net lookup floyd -l
192.168.0.101
woozle@gonzo:~$ net lookup floyd -l
209.86.66.92
woozle@gonzo:~$ net lookup floyd -l
192.168.0.101
woozle@gonzo:~$ net lookup floyd -l
209.86.66.93
woozle@gonzo:~$ net lookup floyd -l
192.168.0.101
woozle@gonzo:~$ net lookup floyd -l
192.168.0.101


Also, phealy says:

i'm betting your problem is that, say, floyd hasn't advertised itself recently and samba can't find it ... is there something in there (referring to resolv.conf) like 'search earthlink.net'?

@gonzo:~$ cat /etc/resolv.conf
search earthlink.net
nameserver 192.168.0.1

yea ... so when it can't find it, it looks to dns ... that search line means "if you can't find 'floyd', try 'floyd.earthlink.net'" ... the delays are, respectively, WINS timing out, and then 'floyd' timing out

<TheWoozle> Ok, that makes sense... but why did it start so abruptly? ... We've certainly had machines come and go on the LAN, and never had weird addresses pop up. If it couldn't find a machine, it would just say so.
<phealy> a setting change, the machine acting as the WINS server went, etc. ... actually, did you change your router? ... or your ISP could have changed their DNS settings
<TheWoozle> Hmm... the router did get a firmware upgrade not long ago.
<phealy> probably that search line wasn't there before and now is
<TheWoozle> And if that search line was on the Samba master browser, it might cause other machines to act similarly?
<phealy> no ... but all machines work that way ... and they're all getting it from the router's DHCP ... if it can't find it via netbios, it looks for DNS
<TheWoozle> The router's DHCP listed the correct address for Floyd.
<phealy> right
<TheWoozle> I sshed to Floyd using the address given by the router; that worked. ssh to Floyd using the net lookup address failed.
<phealy> but the router's dhcp gives them all the searchpath, and they're looking at earthlink.net when they can't find the machine via SMB
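
What phealy describes can be sketched roughly as follows. This is a simplified model of how a glibc-style resolver applies the "search" line, not the resolver's actual code; the function names and the ndots default of 1 are assumptions for illustration.

```python
def parse_resolv_conf(text):
    """Extract the search domains from resolv.conf-style text."""
    search = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            search = fields[1:]  # a later "search" line replaces an earlier one
    return search


def candidates(name, search, ndots=1):
    """List the names a resolver would try, in order (simplified)."""
    if name.endswith("."):
        return [name]  # already fully qualified: the search list is skipped
    qualified = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + qualified  # enough dots: try the name as given first
    return qualified + [name]  # bare name: the search domains come first


# Contents of /etc/resolv.conf on gonzo, as pasted above.
resolv_conf = "search earthlink.net\nnameserver 192.168.0.1\n"

tried = candidates("floyd", parse_resolv_conf(resolv_conf))
print(tried)  # ['floyd.earthlink.net', 'floyd']
```

So a bare "floyd" is asked for as "floyd.earthlink.net" first, and if the ISP answers for every name under its domain, that lookup succeeds with somebody else's address.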

Well, that would explain why saying "ping netbiosname" now produces something other than "unknown host netbiosname":

woozle@gonzo:~$ ping floyd
PING floyd.earthlink.net (209.86.66.93) 56(84) bytes of data.
64 bytes from elydm.04.am.barefruit.com (209.86.66.93): icmp_seq=1 ttl=54 time=20.5 ms
64 bytes from elydm.04.am.barefruit.com (209.86.66.93): icmp_seq=2 ttl=54 time=23.4 ms
64 bytes from elydm.04.am.barefruit.com (209.86.66.93): icmp_seq=3 ttl=54 time=17.2 ms
--- floyd.earthlink.net ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2009ms
rtt min/avg/max/mdev = 17.234/20.401/23.441/2.538 ms
woozle@gonzo:~$ ping bunsen
PING bunsen.earthlink.net (209.86.66.95) 56(84) bytes of data.
64 bytes from elydm.06.am.barefruit.com (209.86.66.95): icmp_seq=1 ttl=54 time=27.4 ms
64 bytes from elydm.06.am.barefruit.com (209.86.66.95): icmp_seq=2 ttl=54 time=31.1 ms
--- bunsen.earthlink.net ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1004ms
rtt min/avg/max/mdev = 27.408/29.278/31.148/1.870 ms
woozle@gonzo:~$ ping beaker
ping: unknown host beaker
woozle@gonzo:~$ net lookup beaker
209.86.66.92
woozle@gonzo:~$ ping beaker
PING beaker.earthlink.net (209.86.66.91) 56(84) bytes of data.
64 bytes from elydm.02.am.barefruit.com (209.86.66.91): icmp_seq=1 ttl=54 time=23.3 ms
64 bytes from elydm.02.am.barefruit.com (209.86.66.91): icmp_seq=2 ttl=54 time=23.0 ms
--- beaker.earthlink.net ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1007ms
rtt min/avg/max/mdev = 23.046/23.184/23.323/0.205 ms

Next: how to verify that this is happening because of the router, and then how to make it stop.

2006-09-09 analysis

  • The router seems to be fetching any locally-unresolved names (i.e. names which the router can't resolve) from EarthLink's servers.
  • Query: why are they unresolved? Check to see what addresses those actual machines think they have.
    • Answer: ifconfig on Bunsen reveals correct IP (192.168.0.106) even though "net lookup bunsen" on gonzo returns 209.86.66.94
    • Note: DHCP listing on router has 192.168.0.106 listed as "unknown"...
    • Therefore: seems likely router is not recognizing the name "bunsen", and replies that it does not know bunsen's address
  • Query: why isn't the router receiving Bunsen's name in the DHCP request? Or, if it is receiving it, why isn't it storing it?
    • Subquery: how can I tell if Bunsen is correctly transmitting its name when it does a DHCP?
  • Flaw in theory: router's DHCP list shows 192.168.0.105 for Mokey, but "net lookup mokey" returns 209.86.66.92 on bunsen and 209.86.66.94 on gonzo. (Refreshed view of router's DHCP list just to be sure; no apparent changes.) At the moment, these addresses are persisting.
    • Query: how to get a debug/trace log of "net lookup"'s activity? Do we have to use ethereal?
    • Answer: no. "net -d 2 lookup machinename" returns more info:
    woozle@gonzo:~$ net -d 2 lookup mokey
    [2006/09/09 10:27:45, 2] lib/interface.c:add_interface(81)
    added interface ip=192.168.0.103 bcast=192.168.0.255 nmask=255.255.255.0
    [2006/09/09 10:27:45, 2] libsmb/namequery.c:name_query(492)
    Got a positive name query response from 127.0.0.1 ( 209.86.66.94 )
    209.86.66.94
    [2006/09/09 10:27:45, 2] utils/net.c:main(878)
    return code = 0
    • More: a net lookup on bunsen was less illuminating at first; I had to go up to -d 5 before I got this bit at the end:
    Netbios name list:-
    my_netbios_names[0]="BUNSEN"
    [2006/09/09 10:31:13, 2] lib/interface.c:add_interface(81)
    added interface ip=192.168.0.106 bcast=192.168.0.255 nmask=255.255.255.0
    [2006/09/09 10:31:13, 5] lib/gencache.c:gencache_init(60)
    Opening cache file at /var/cache/samba/gencache.tdb
    [2006/09/09 10:31:13, 5] libsmb/namecache.c:namecache_fetch(201)
    name mokey#20 found.
    209.86.66.92
    [2006/09/09 10:31:13, 2] utils/net.c:main(988)
    return code = 0
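    • It looks like the incorrect address for Mokey has been cached on Bunsen (in Samba's name cache, /var/cache/samba/gencache.tdb, judging by the namecache_fetch line above; "my_netbios_names" only lists Bunsen's own names). Query: how to clear/refresh the cache?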
    • Discovery: nmblookup machinename apparently refreshes the cache some of the time. (It also will return a non-127.0.0 address for the localhost.) However, it doesn't seem to be solving the problem on gonzo:
    woozle@gonzo:~$ nmblookup mokey
    querying mokey on 192.168.0.255
    192.168.0.105 mokey<00>
    woozle@gonzo:~$ net lookup mokey
    209.86.66.94
    woozle@gonzo:~$ nmblookup mokey
    querying mokey on 192.168.0.255
    192.168.0.105 mokey<00>
    woozle@gonzo:~$ net lookup mokey
    209.86.66.94
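
The comparison above (broadcast answer from nmblookup versus whatever "net lookup" returns) could be automated along these lines. This is only a sketch that works on captured output text; the regular expression is based solely on the nmblookup lines pasted here, and the function names are made up for illustration.

```python
import re

# Matches lines such as "192.168.0.105 mokey<00>" (format taken from the paste).
NMB_LINE = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+(\S+)<([0-9a-fA-F]{2})>\s*$"
)


def parse_nmblookup(output):
    """Return (address, name, type) tuples found in nmblookup output text."""
    hits = []
    for line in output.splitlines():
        match = NMB_LINE.match(line)
        if match:
            hits.append((match.group(1), match.group(2), match.group(3)))
    return hits


def disagrees(nmblookup_output, net_lookup_output):
    """True if the broadcast gave an answer and "net lookup" returned another."""
    broadcast = {addr for addr, _, _ in parse_nmblookup(nmblookup_output)}
    return bool(broadcast) and net_lookup_output.strip() not in broadcast


# Output copied from the session on gonzo.
nmb = "querying mokey on 192.168.0.255\n192.168.0.105 mokey<00>\n"
net = "209.86.66.94\n"

print(parse_nmblookup(nmb))  # [('192.168.0.105', 'mokey', '00')]
print(disagrees(nmb, net))   # True
```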

2006-09-10 Resolution

The problem appears to have been caused in large part by EarthLink's new "dead domain handling" scheme, possibly exacerbated by my DLink DI-604's sloppy handling of NetBIOS names, which in turn may have been exacerbated by a recent firmware upgrade I did to it (although that should have improved things, of course).
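
One way to check for this kind of "dead domain handling" is to look up names that cannot possibly exist and see whether they resolve anyway. The following is a minimal sketch, not something that was run at the time; the function name wildcard_answers and the choice of three probes are assumptions, and the resolver is passed in as a parameter so the logic can be tried without a network.

```python
import socket
import uuid


def wildcard_answers(domain, resolve=socket.gethostbyname, probes=3):
    """Resolve random names under `domain`; return the addresses that come back.

    An honest DNS setup should fail every probe, giving an empty list.
    """
    answers = []
    for _ in range(probes):
        name = f"{uuid.uuid4().hex}.{domain}"  # random label, should not exist
        try:
            answers.append(resolve(name))
        except OSError:  # socket.gaierror (lookup failure) is a subclass
            pass
    return answers


# Example (needs network access, so it is left as a comment):
#   print(wildcard_answers("earthlink.net"))
```

A non-empty result means something upstream is answering for nonexistent names. As far as I know, dnsmasq has a bogus-nxdomain option intended for turning such answers back into "not found"; whether that was among the tweaks described below is not recorded.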

We have turned off the DLink router's DHCP and are now using dnsmasq, using non-EarthLink upstream DNS servers (Xmission and IODynamics (Tenebram's employer)). Tene also made a number of other little tweaks, including one which now resolves NetBIOS names as effectively as regular Internet domain names, so (e.g.) I can now "ping bunsen".

2006-09-24 related problem?

For some reason, Beaker suddenly wasn't picking up DHCP from Gonzo; dnsmasq (on Gonzo) might have gotten turned off accidentally when I restarted Samba, and that may have gotten Beaker all confused. "ipconfig /renew_all" didn't work: it reported that no DHCP server was found and reverted to using the hardware router (192.168.0.1) for DNS, rather than coming up with an autoconfig IP address. (I checked, and the router's DHCP was still turned off, so I don't know how it worked that one out.)

What eventually worked was:

  • ipconfig /release_all on Beaker
  • shut down Beaker
  • wait 5 seconds, then press Beaker's power button to turn him back on

The network started appearing normally after that. "ipconfig /all" confirms that Gonzo (192.168.0.254) is now the DHCP server.